netcdf4
Date: May 24th 2016
Last updated: May 24th 2016
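netCDF4 is a Python interface to netCDF files. A Dataset holds N-dimensional variables together with their dimensions and attributes; the variables and dimensions are exposed as dictionaries, and a Variable object can be indexed and sliced much like a numpy array.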
I have included the installation of netcdf4 with the installation of basemap (see installing basemap). I have done so because many of the larger datasets that I wanted to map using basemap come in the form of a netcdf4 dataset (e.g. global wave height models).
This entry provides some basic variable selection from an existing netcdf4 dataset.
Useful resources
- https://www.unidata.ucar.edu/software/netcdf/workshops/2012/netcdf_python/netcdf4python.pdf
- http://unidata.github.io/netcdf4-python/
- https://www.getdatajoy.com/learn/Read_and_Write_NetCDF_Files_from_Python
collect data
from netCDF4 import Dataset
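# general form: d = Dataset(path_or_url)
# e.g. global pressure system data served over OPeNDAP
# (adjacent string literals are joined, so the URL contains no newlines)
URL = ("http://nomads.ncep.noaa.gov:9090/"
       "dods/gfs_0p25_1hr/gfs20160523/"
       "gfs_0p25_1hr_12z")
d = Dataset(URL)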
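The URL above encodes a run date and a forecast cycle. As a rough sketch, a small helper can assemble the same string from its parts. The name `gfs_url` is hypothetical, and the assumption that other dates and cycles follow the same layout is not confirmed by anything in this entry; the server address may also have changed, or may no longer host this run.

```python
from datetime import date

# Base taken from the example URL in this entry; an assumption for any other run.
BASE = "http://nomads.ncep.noaa.gov:9090/dods/gfs_0p25_1hr"


def gfs_url(run_date, cycle_hour):
    """Build a URL shaped like the example above (hypothetical helper)."""
    if cycle_hour not in (0, 6, 12, 18):
        raise ValueError("GFS cycles run at 00, 06, 12 and 18 UTC")
    stamp = run_date.strftime("%Y%m%d")
    return "{}/gfs{}/gfs_0p25_1hr_{:02d}z".format(BASE, stamp, cycle_hour)


url = gfs_url(date(2016, 5, 23), 12)
print(url)
```

The result can be passed straight to `Dataset(url)`.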
netcdf4 data object
d
"""
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF3_CLASSIC data model, file format UNDEFINED):
title: GFS 0.25 deg Hourly starting from 12Z23may2016, downloaded May 23 16:07 UTC
Conventions: COARDS
GrADS
dataType: Grid
history: Mon May 23 18:19:40 UTC 2016 : imported by GrADS Data Server 2.0
dimensions(sizes): lat(721), lev(31), lon(1440), time(121)
variables(dimensions): float64 time(time), float64 lev(lev), float64 lat(lat), float64 lon(lon),
...<-snipped->
"""
Get info of netcdf4 dataset
for line in d.__doc__.split('\n'):
    print(line)
# output snipped into pieces below
# 1. list of attributes
"""A list of attribute names corresponding to
global netCDF attributes defined for the
`netCDF4.Dataset` can be obtained with the
`netCDF4.Dataset.ncattrs` method."""
d.ncattrs()
['title', 'Conventions', 'dataType', 'history']
# 2. dictionary of name/value pairs
"""A dictionary containing all the netCDF
attribute name/value pairs is provided by
the `__dict__` attribute of a `netCDF4.Dataset`
instance."""
d.__dict__
OrderedDict([('title', 'GFS 0.25 deg Hourly starting from 12Z23may2016, downloaded May 23 16:07 UTC'), ('Conventions', 'COARDS\nGrADS'), ('dataType', 'Grid'), ('history', 'Mon May 23 18:19:40 UTC 2016 : imported by GrADS Data Server 2.0')])
details of netcdf4 object
# refer to dataset.__doc__ for more info
# dimensions
d.dimensions
OrderedDict([('lat', <class 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 721
), ('lev', <class 'netCDF4._netCDF4.Dimension'>: name = 'lev', size = 31
), ('lon', <class 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 1440
), ('time', <class 'netCDF4._netCDF4.Dimension'>: name = 'time', size = 121
)])
# variables (can be a very long object)
d.variables
# tail end of variables printed to the terminal
"""
<snipped>...
), ('wiltsfc', <class 'netCDF4._netCDF4.Variable'>
float32 wiltsfc(time, lat, lon)
_FillValue: 9.999e+20
missing_value: 9.999e+20
long_name: ** surface wilting point [fraction]
unlimited dimensions:
current shape = (121, 721, 1440)
filling off
), ('var00212m', <class 'netCDF4._netCDF4.Variable'>
float32 var00212m(time, lat, lon)
_FillValue: 9.999e+20
missing_value: 9.999e+20
long_name: ** 2 m above ground desc [unit]
unlimited dimensions:
current shape = (121, 721, 1440)
filling off
)])"""
# groups
d.groups
OrderedDict()
# format
d.file_format
'NETCDF3_CLASSIC'
d.data_model
'NETCDF3_CLASSIC'
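The shapes printed above are worth a quick calculation before reading anything, because a full read of one variable from a remote dataset is large. The sketch below estimates the size from the shape and the 4 bytes per float32 value; `full_read_bytes` is a hypothetical helper name, and `math.prod` needs Python 3.8 or later.

```python
from math import prod


def full_read_bytes(shape, itemsize):
    """Number of bytes needed to hold an array of this shape."""
    return prod(shape) * itemsize


shape = (121, 721, 1440)                  # (time, lat, lon) as printed above
total = full_read_bytes(shape, 4)         # float32 = 4 bytes per value
one_step = full_read_bytes(shape[1:], 4)  # a single time step

print("whole variable: {:.1f} MiB".format(total / 2**20))
print("one time step:  {:.1f} MiB".format(one_step / 2**20))
```

This is why the examples in this entry index a single time step rather than reading a whole variable.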
list variable names
for var in d.variables:
    print("{}".format(var))
"""
time
lev
lat
lon
absvprs
no4lftxsfc
no5wavh500mb
acpcpsfc
albdosfc
apcpsfc
capesfc
cape180_0mb
cape255_0mb
cfrzrsfc
cicepsfc
cinsfc
cin180_0mb
cin255_0mb
clwmrprs
cpofpsfc
cpratsfc
crainsfc
csnowsfc
cwatclm
cworkclm
dlwrfsfc
dpt2m
dswrfsfc
fldcpsfc
gfluxsfc
gustsfc
hgtsfc
hgtprs
hgt2pv
hgtneg2pv
hgttop0c
hgt0c
hgtmwl
hgttrop
hindexsfc
hlcy3000_0m
hpblsfc
icahtmwl
icahttrop
icecsfc
icsevprs
landsfc
lftxsfc
lhtflsfc
msletmsl
o3mrprs
pevprsfc
plpl255_0mb
potsig995
pratesfc
preslclb
preslclt
presmclb
presmclt
preshclb
preshclt
pressfc
pres80m
pres2pv
presneg2pv
prescclb
prescclt
presmwl
prestrop
prmslmsl
pwatclm
rhprs
rh2m
rhsg330_1000
rhsg440_1000
rhsg720_940
rhsg440_720
...<- snipped ->
"""
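With this many variables, filtering the names helps. Judging only by the long names shown in this entry, the `sfc` suffix appears to mark surface fields, `prs` fields on pressure levels, and `2m` fields at 2 m above ground; treat that reading as an assumption. A minimal sketch, using a hypothetical helper `names_ending_with` and a handful of names copied from the listing above:

```python
def names_ending_with(names, suffix):
    """Return the names that end with the given suffix, in original order."""
    return [name for name in names if name.endswith(suffix)]


# A few names from the listing above; d.variables can be passed in directly,
# since iterating over it yields the variable names.
sample = ["time", "lat", "pressfc", "gustsfc", "rhprs",
          "hgtprs", "rh2m", "dpt2m", "pwatclm"]

surface = names_ending_with(sample, "sfc")
levels = names_ending_with(sample, "prs")
two_metre = names_ending_with(sample, "2m")
print(surface, levels, two_metre)
```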
Access data for one variable
# access a variable
d.variables['prmslmsl']
"""
<class 'netCDF4._netCDF4.Variable'>
float32 prmslmsl(time, lat, lon)
_FillValue: 9.999e+20
missing_value: 9.999e+20
long_name: ** mean sea level pressure reduced to msl [pa]
unlimited dimensions:
current shape = (121, 721, 1440)
filling off
"""
# note that the attributes are accessible by name
d.variables['prmslmsl'].long_name
'** mean sea level pressure reduced to msl [pa] '
# get array of data
d.variables['prmslmsl'][0]
array([[ 100503.9296875, 100503.9296875, 100503.9296875, ...,
100503.9296875, 100503.9296875, 100503.9296875],
[ 100437.6875 , 100437.8515625, 100438.0078125, ...,
100437.3671875, 100437.53125 , 100437.6875 ],
[ 100378.8046875, 100379.125 , 100379.2890625, ...,
100378.1640625, 100378.328125 , 100378.6484375],
...,
[ 102444.0859375, 102443.9296875, 102443.765625 , ...,
102444.40625 , 102444.25 , 102444.25 ],
[ 102439.4453125, 102439.4453125, 102439.4453125, ...,
102439.765625 , 102439.609375 , 102439.609375 ],
[ 102435.9296875, 102435.9296875, 102435.9296875, ...,
102435.9296875, 102435.9296875, 102435.9296875]], dtype=float32)
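The variable description lists `_FillValue: 9.999e+20`, so any cell holding that number is missing data rather than a real pressure. Depending on the netCDF4 version and settings, slices may already come back as masked arrays; the sketch below does the masking by hand on a plain numpy array and converts pascals to hectopascals (1 hPa = 100 Pa). The helper name `pressure_in_hpa` and the 2x2 `patch` are made up for illustration.

```python
import numpy as np

FILL_VALUE = 9.999e20  # _FillValue / missing_value reported for prmslmsl


def pressure_in_hpa(raw_pa, fill_value=FILL_VALUE):
    """Mask fill values, then convert pascals to hectopascals."""
    data = np.asarray(raw_pa, dtype="float32")
    masked = np.ma.masked_values(data, fill_value)
    return masked / 100.0


# Made-up patch standing in for a slice of d.variables['prmslmsl'][0]
patch = np.array([[101325.0, FILL_VALUE],
                  [100000.0, 102000.0]], dtype="float32")

hpa = pressure_in_hpa(patch)
print(hpa.mask.sum())  # number of masked cells
print(hpa.mean())      # mean over the valid cells only
```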
# numpy-style slicing rules apply
d.variables['prmslmsl'][0][0]
array([ 100503.9296875, 100503.9296875, 100503.9296875, ...,
100503.9296875, 100503.9296875, 100503.9296875], dtype=float32)
# single value at position indexed at zero
d.variables['prmslmsl'][0][0][0]
100503.93
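Indexing by position is more useful once a latitude and longitude can be turned into indices. The sketch below assumes the grid runs from -90 to 90 degrees latitude (721 points) and from 0 to 359.75 degrees longitude (1440 points) in 0.25 degree steps. That layout matches the dimension sizes shown earlier, but the ordering is an assumption and should be checked against the `lat` and `lon` variables. `grid_indices` is a hypothetical helper.

```python
STEP = 0.25    # grid spacing in degrees (assumed from the dataset title)
N_LAT = 721    # size of the lat dimension
N_LON = 1440   # size of the lon dimension


def grid_indices(lat, lon):
    """Return (lat_idx, lon_idx) of the grid cell nearest to lat/lon.

    Assumes lat starts at -90 and lon starts at 0, both increasing.
    """
    lat_idx = int(round((lat + 90.0) / STEP))
    lat_idx = max(0, min(N_LAT - 1, lat_idx))       # clamp at the poles
    lon_idx = int(round((lon % 360.0) / STEP)) % N_LON  # wrap around 360
    return lat_idx, lon_idx


print(grid_indices(0.0, 0.0))
print(grid_indices(51.5, -0.25))
```

With indices in hand, a single value could be read as `d.variables['prmslmsl'][0, lat_idx, lon_idx]`, which avoids pulling a whole time step just to pick one cell.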
get long names of variables
for var in d.variables:
    print("{} ({})".format(var, d.variables[var].long_name))
"""
<- snipped ->
...
preshclb (** high cloud bottom level pressure [pa] )
preshclt (** high cloud top level pressure [pa] )
pressfc (** surface pressure [pa] )
pres80m (** 80 m above ground pressure [pa] )
pres2pv (** pv=2e-06 (km^2/kg/s) surface pressure [pa] )
presneg2pv (** pv=-2e-06 (km^2/kg/s) surface pressure [pa] )
prescclb (** convective cloud bottom level pressure [pa] )
prescclt (** convective cloud top level pressure [pa] )
presmwl (** max wind pressure [pa] )
prestrop (** tropopause pressure [pa] )
prmslmsl (** mean sea level pressure reduced to msl [pa] )
pwatclm (** entire atmosphere (considered as a single layer) precipitable water [kg/m^2] )
rhprs (** (1000 975 950 925 900.. 7 5 3 2 1) relative humidity [%] )
rh2m (** 2 m above ground relative humidity [%] )
...
<- snipped ->
"""
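The long names above follow a visible pattern: a leading `**`, a description, and a unit in square brackets. As a sketch, a regular expression can split them into a clean description and unit. The helper `parse_long_name` is hypothetical, and the pattern is inferred only from the strings printed in this entry, so other datasets may not match it.

```python
import re

# "** <description> [<unit>] " as seen in the output above
LONG_NAME = re.compile(r"^\*\*\s*(.*?)\s*\[([^\]]*)\]\s*$")


def parse_long_name(text):
    """Return (description, unit); unit is None if the pattern does not match."""
    match = LONG_NAME.match(text)
    if match is None:
        return text.strip(), None
    return match.group(1), match.group(2)


print(parse_long_name("** mean sea level pressure reduced to msl [pa] "))
print(parse_long_name("** 2 m above ground relative humidity [%] "))
```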