Hydrographic Climate Database (Climate)
The Climate Database
The Ocean Science Hydrographic Climate Database (Climate) is a comprehensive, open access collection
of temperature and salinity data for the Northwest Atlantic and Eastern Arctic, an area defined by
35°N - 80°N and 42°W - 100°W. The data come from a variety of sources including hydrographic bottles,
CTD casts, profiling floats, spatially and temporally averaged Batfish tows, and expendable, digital
or mechanical bathythermographs. Near real-time observations of temperature and salinity from the Global
Telecommunications System (GTS) are also included. The database currently consists of approximately
850,000 profiles and 35 million individual observations from 1910 to the present. Vertical resolution varies
from 1 m (for CTDs) to ~10-100 m for traditional bottle casts. Climate is updated monthly and approximately
20,000 new profiles are added each year.
Validation
Initial validation is carried out by the originating institute or organization.
All data, whether from Canadian or foreign sources, are also validated by the
Marine Environmental Data Service (MEDS), the national data center for the
Department of Fisheries and Oceans. The primary validation procedures at MEDS
are described in the IOC publication GTSPP Real Time Quality Control Manual.
At BIO, the data are subjected to a set of final tests before
being incorporated into the database. One of the primary functions of this validation is
the determination and elimination of duplicate profiles.
For climatological purposes, we have defined a duplicate as any
profile which is within 0.02° of latitude, 0.03° of longitude (roughly 3 km)
and 30 minutes of another profile.
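The duplicate test can be sketched in a few lines of Python. This is only an illustration of the three tolerances quoted above, not the code used at BIO: the (latitude, longitude, datetime) tuple layout and the function name are assumptions, and treating the limits as inclusive is also an assumption.

```python
from datetime import datetime, timedelta

# Tolerances quoted in the text.
MAX_DLAT = 0.02                  # degrees of latitude
MAX_DLON = 0.03                  # degrees of longitude
MAX_DT = timedelta(minutes=30)   # time separation


def is_duplicate(a, b):
    """Return True if two profiles fall within all three tolerances.

    Each profile is a (latitude, longitude, datetime) tuple; this layout
    is hypothetical. The limits are treated as inclusive (an assumption).
    """
    lat_a, lon_a, t_a = a
    lat_b, lon_b, t_b = b
    return (abs(lat_a - lat_b) <= MAX_DLAT
            and abs(lon_a - lon_b) <= MAX_DLON
            and abs(t_a - t_b) <= MAX_DT)


ctd = (44.500, -63.400, datetime(1995, 7, 14, 12, 0))
xbt = (44.510, -63.420, datetime(1995, 7, 14, 12, 20))
far = (44.600, -63.400, datetime(1995, 7, 14, 12, 0))
late = (44.500, -63.400, datetime(1995, 7, 14, 13, 0))
```

Here ctd and xbt would be treated as duplicates, while far (too distant in latitude) and late (an hour apart) would not.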
The choice of which duplicate to keep is based on a data type
hierarchy. A CTD down cast is at the highest level, followed by bottle casts, the
various BT types, and finally the low-resolution IGOSS Tesac and Bathy messages.
Low-resolution Bathy and Tesac messages are replaced with the higher-resolution CTD or
XBT data as they become available after having worked their way through the collecting
agency, the national data center if outside of Canada, and finally to MEDS. It
may take a number of years before we receive the final version of the data to replace
the near real-time IGOSS data. If data type does not resolve the selection,
the choice falls successively to Canadian data over foreign, then to
the profile with the greatest number of observations, and finally to the profile
with the greatest depth.
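One way to express this selection order is as a sort key. The sketch below is an assumption-laden illustration: the type labels, the dictionary fields and the equal ranking of Tesac and Bathy messages are all hypothetical; only the order of the criteria comes from the description above.

```python
# Lower rank = preferred. The labels are hypothetical; only their order
# (CTD, bottle, BT types, then IGOSS Tesac/Bathy messages) follows the text.
TYPE_RANK = {"CTD": 0, "BOTTLE": 1, "BT": 2, "TESAC": 3, "BATHY": 3}


def preference_key(profile):
    """Build a tuple that sorts the preferred profile first."""
    return (
        TYPE_RANK[profile["type"]],       # 1. data type hierarchy
        0 if profile["canadian"] else 1,  # 2. Canadian over foreign
        -profile["n_obs"],                # 3. most observations
        -profile["max_depth"],            # 4. greatest depth
    )


def select_duplicate(profiles):
    """Return the profile to keep from a group of duplicates."""
    return min(profiles, key=preference_key)


group = [
    {"type": "BATHY", "canadian": True, "n_obs": 12, "max_depth": 450.0},
    {"type": "CTD", "canadian": False, "n_obs": 300, "max_depth": 300.0},
    {"type": "CTD", "canadian": True, "n_obs": 250, "max_depth": 250.0},
]
kept = select_duplicate(group)
```

In this example the Canadian CTD cast is kept: both CTD casts outrank the Bathy message, and the Canadian cast wins the tie even though the foreign cast has more observations.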
Profiles which have been flagged as having failed the MEDS QC
are individually examined to determine if any may be salvaged. No attempt is made
to correct erroneous data; however, individual data levels may be discarded and the
remaining portion of a profile retained.
In addition, the entire database is subjected to various ongoing
subjective and objective tests to improve the overall confidence in the data. Stations
with individual observations more than 3 standard deviations from the seasonal mean value,
derived from a 1° grid averaged over depth intervals ranging from 25 meters at the surface
to 500 meters in the deep ocean, have been individually examined and removed
when deemed appropriate.
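The objective part of this test amounts to a simple threshold check. The sketch below assumes the seasonal mean and standard deviation for the relevant grid square and depth interval are already known; the function name and arguments are hypothetical. It only flags candidates, since the text states that removal follows individual examination.

```python
def flag_outliers(values, clim_mean, clim_std, n_sigma=3.0):
    """Return the indices of values more than n_sigma standard deviations
    from the climatological mean. Flagged values are candidates for
    inspection, not automatic removal."""
    return [i for i, v in enumerate(values)
            if abs(v - clim_mean) > n_sigma * clim_std]


# Hypothetical temperatures for one grid square, season and depth interval.
suspects = flag_outliers([5.0, 5.5, 9.1, 4.8], clim_mean=5.0, clim_std=1.0)
```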
Interpolated values
During the early 1970s, data were sent to MEDS from BIO as inflection points based
on a linear regression tolerance of 0.01° or 0.01 psu. Also, during the period 1969-89,
MEDS had a limit of 99 levels for a single CTD profile, which they enforced by using a
similar reduction technique. As a consequence, much of the CTD data from this period is
at a much reduced resolution. As a one-time correction, all CTD data in the database with
an average depth resolution (maximum depth / number of observations) of less than 5 meters
were interpolated to include values at standard oceanographic depths.
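As an illustration only, the sketch below computes the average depth resolution as defined above and interpolates a profile onto a list of target depths. Linear interpolation is an assumption (the interpolation method is not stated here), and the target depths in the example are arbitrary rather than the actual standard depth list.

```python
def average_depth_resolution(depths):
    """Maximum depth divided by the number of observations."""
    return max(depths) / len(depths)


def interpolate_to_depths(depths, values, targets):
    """Linearly interpolate a profile onto the target depths.

    depths must be strictly increasing. Targets outside the sampled
    range are skipped rather than extrapolated.
    """
    out = []
    for z in targets:
        if z < depths[0] or z > depths[-1]:
            continue
        for i in range(len(depths) - 1):
            z0, z1 = depths[i], depths[i + 1]
            if z0 <= z <= z1:
                frac = (z - z0) / (z1 - z0)
                out.append((z, values[i] + frac * (values[i + 1] - values[i])))
                break
    return out


depths = [0.0, 10.0, 30.0]
temps = [10.0, 8.0, 4.0]
resolution = average_depth_resolution(depths)
filled = interpolate_to_depths(depths, temps, [0.0, 20.0, 50.0])
```

The 50 m target is dropped because it lies below the deepest observation.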
References
Gregory, D.N. 2004. Climate: A Database of Temperature and Salinity Observations for the Northwest Atlantic. DFO Can. Sci. Advis. Sec. Res. Doc. 2004/075.
Department of Fisheries and Oceans, Marine Environmental Data Service web site, Manuals and Guides #22, GTSPP Real-time Quality Control Manual: http://www.meds-sdmm.dfo-mpo.gc.ca/ALPHAPRO/gtspp/qcmans/MG22/guide22_e.htm
The Climate Application
This application extracts information from the Ocean Sciences hydrographic database
according to user specified spatial and temporal criteria. Output results can be either
statistical summaries of the data or the actual data stored in the database (for input
into your own analyses). The query is performed off-line. You will be contacted by email
when your results become available. Results should normally be available within a few
hours (depending upon the size of your query and the number of requests ahead of you).
If you haven't had a reply within 24 hours, contact us and we will try to determine
what happened.
The query screen is the "home base" for the application.
Most of the fields on the query form are linked to help text that can be displayed at any time.
Processing options include the ability to select only those records
that contain both temperature and salinity observations and an option to average the values
within a profile according to the depth specification. This reduces the resolution of highly
sampled data to more closely resemble observations sampled much less frequently.
Users can request a number of different data products which include
a station index of latitude, longitude and date/time for each profile selected, individual
observations making up the profile, time series based on monthly averages within the latitude,
longitude, depth volume, or a seasonal cycle based on averages over all months from the time series
statistics.
A Brief Tour
(A) Query Identification All of your queries are
assigned a unique query number and saved under your username. You can re-run existing
queries or edit them and submit them as new queries. You can assign a name under TITLE.
This name is saved internally in the results files for your reference.
(B) Area Selection Type You may define a geographical
area in one of three ways:
- choose from a list of predefined polygons (multiple selections are permitted).
The pre-defined polygons are shown in areas.html.
- provide your own polygon definition by latitude/longitude co-ordinates. To do this,
select the Define Area button at the top of your screen, follow the instructions to
create a new polygon, and then select it from the list of predefined polygons.
- define a rectangle by latitude/longitude coordinates. The blocks parameter permits
one to subdivide the entire rectangle into x by y blocks. For example,
specifying latitude from 42° to 45° and longitude from -62° to -65°
with 1° blocks in both latitude and longitude would result in defining 9 separate
1° grid squares for which statistics would be generated.
Note that the convention for longitude is positive East. Latitude
and Longitude must be specified as decimal degrees, but we provide a converter if you
prefer degrees, minutes, seconds.
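If you would rather convert coordinates yourself, the arithmetic is straightforward. The function below is a minimal sketch, not the converter supplied with the application; its name and arguments are hypothetical. It follows the positive East convention, so western longitudes (and southern latitudes) come out negative.

```python
def dms_to_decimal(degrees, minutes=0.0, seconds=0.0, west_or_south=False):
    """Convert degrees, minutes, seconds to signed decimal degrees."""
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -value if west_or_south else value


lon = dms_to_decimal(63, 30, 0, west_or_south=True)  # 63°30'00" W
lat = dms_to_decimal(44, 15, 36)                     # 44°15'36" N
```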
A note on blocking (gridding)
If latitude/longitude blocking is not requested (i.e., the lat_block and
lon_block fields are left blank), the output will contain all the data
within the search rectangle, including points that lie on the boundaries
of the rectangle.
If blocking is requested, the following algorithm is used.
Bins of the user-specified size are created, originating from the point (0,0).
The data that belong to the user-specified rectangle are then distributed
among these bins. Data on the left and upper boundaries of a bin are
excluded from it; these points are assigned to the adjacent bin. The final output
includes only the data that belong to bins whose centers are within or
on the boundary of the rectangle.
As a result, depending on the rectangle co-ordinates and the block size,
the output may exclude part of the data that belong to the specified rectangle.
For example, in scenario (a) of the above diagram, the boundaries of
the specified rectangle align exactly on the gridlines. The points along the
dotted boundaries are excluded because the centers of the bins that contain these
points lie outside the rectangle.
In example (b), the boundaries of the rectangle pass through the centers of
the surrounding grid bins. All the points that are within the rectangle, as well
as those on the boundaries of the rectangle, will be included. However,
it is important to note that the bins along the boundaries will
include only the data for the section of each bin that lies within the rectangle.
In case (c), the output would exclude the data on the dotted
lines and in the shaded area as the centers of the bins that contain these
points lie outside the rectangle.
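The blocking rules can be sketched as follows. This is an interpretation of the description, not the application's own code: it assumes that "left" means the western edge and "upper" the northern edge of a bin, and the function names are hypothetical. Floating-point block sizes that do not divide evenly may behave unexpectedly at bin edges in a sketch this simple.

```python
import math


def bin_index(lat, lon, lat_block, lon_block):
    """Return (row, column) of the bin holding a point.

    Bins originate at (0, 0). A bin excludes its left (western) and upper
    (northern) edges, so it covers
        column * lon_block <  lon <= (column + 1) * lon_block
        row * lat_block    <= lat <  (row + 1) * lat_block
    """
    column = math.ceil(lon / lon_block) - 1
    row = math.floor(lat / lat_block)
    return row, column


def bin_center(row, column, lat_block, lon_block):
    """Return the (latitude, longitude) of a bin's center."""
    return (row + 0.5) * lat_block, (column + 0.5) * lon_block


def in_output(lat, lon, rect, lat_block, lon_block):
    """True if a point is in the rectangle and its bin center is within
    or on the boundary of the rectangle.

    rect is (lat_min, lat_max, lon_min, lon_max).
    """
    lat_min, lat_max, lon_min, lon_max = rect
    if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
        return False
    row, column = bin_index(lat, lon, lat_block, lon_block)
    c_lat, c_lon = bin_center(row, column, lat_block, lon_block)
    return lat_min <= c_lat <= lat_max and lon_min <= c_lon <= lon_max


# Rectangle aligned on the gridlines, as in scenario (a).
rect = (42.0, 45.0, -65.0, -62.0)
inside = in_output(43.5, -63.5, rect, 1.0, 1.0)
on_left_edge = in_output(43.5, -65.0, rect, 1.0, 1.0)
on_upper_edge = in_output(45.0, -63.5, rect, 1.0, 1.0)
on_lower_right_corner = in_output(42.0, -62.0, rect, 1.0, 1.0)
```

With the rectangle aligned on the gridlines, points on its left and upper edges fall into bins centered outside the rectangle and are dropped, while points on the lower and right edges are kept.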
(C) Time Specification Specify a continuous time period
from (month/day/year) to (month/day/year) and/or a seasonal window. Months are inclusive.
Months 6 to 9 means June to September. Months 12 to 2 would select only December to
February. Default is all months for the entire time period for which there are data.
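A seasonal window that wraps around the end of the year can be tested as in the sketch below; the function name is hypothetical and the snippet only illustrates the inclusive month logic described above.

```python
def month_in_window(month, first, last):
    """True if month (1-12) lies in the inclusive window first..last.

    A window such as 12 to 2 wraps around the end of the year.
    """
    if first <= last:
        return first <= month <= last
    return month >= first or month <= last


summer = [m for m in range(1, 13) if month_in_window(m, 6, 9)]
winter = [m for m in range(1, 13) if month_in_window(m, 12, 2)]
```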
(D) Depth Specification Depth ranges are specified by
entering a string of comma-separated fields defining:
First_bin, Bin_size, Bin_interval, Last_bin
The string is read as "Starting with First_bin, plus/minus Bin_size, repeat every Bin_interval until
Last_bin". Multiple depth specifications can be included for each query.
Example
0, 5, 25, 100
150, 10, 50, 250
This will generate statistics for the bin ranges 0±5,
25±5, 50±5, 75±5, 100±5, 150±10, 200±10, 250±10.
If not specified, the last two values (Bin_Interval and Last_bin) default to 0 and First_bin.
e.g. 500,500 will extract all data for 500±500 metres. This is the same as the old specification.
The numbers do not have to be integers.
Real number entries will be rounded to the second decimal place during
query processing. If you are specifying multiple depth specifications, separate
each set by pressing the enter key, ensuring that each set appears on a line by
itself.
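To check what a depth specification will produce before submitting it, the expansion rule can be reproduced in a few lines. This is a sketch based on the description above, not the application's parser; the function name is hypothetical.

```python
def expand_depth_spec(text):
    """Expand depth specification lines into (center, half_width) bins.

    Each line reads: First_bin, Bin_size[, Bin_interval[, Last_bin]].
    Missing Bin_interval and Last_bin default to 0 and First_bin.
    Values are rounded to two decimal places, as described in the text.
    """
    bins = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = [round(float(f), 2) for f in line.split(",")]
        first, size = fields[0], fields[1]
        interval = fields[2] if len(fields) > 2 else 0.0
        last = fields[3] if len(fields) > 3 else first
        center = first
        while center <= last:
            bins.append((center, size))
            if interval <= 0:
                break  # a single bin; avoids looping forever
            center = round(center + interval, 2)
    return bins


example = expand_depth_spec("0, 5, 25, 100\n150, 10, 50, 250")
legacy = expand_depth_spec("500,500")
```

Here example holds the eight bins listed in the Example above, and legacy holds the single bin 500±500.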
(E) Processing Options There are two optional selection
criteria. One or both may be selected.
TS Only selects only records that contain both temperature and salinity
observations. Because we have a lot of XBT data, there are many more temperature
observations than salinity. The statistics for sigma-t are based on individual
computations. They are not based on the overall statistics of temperature and salinity.
Bin Averaging is appropriate only if you have requested the complete profile
data. This option averages the values within a profile according to the depth
specification. What this does is reduce the resolution of highly sampled data to
something more closely resembling observations sampled much less frequently. Suppose
for example that you intended to use the individual observations to optimally estimate
temperature. CTD data, sampled at every meter, would dominate bottle data sampled
every 25 meters. Specifying bin averaging and depth ranges of 10 or 20 meters would
result in getting a single average CTD observation for the 10 (or 20) meter level,
much closer to the 25 to 50 meter resolution you would expect from a bottle.
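The effect of bin averaging on a densely sampled profile can be sketched as follows. The data layout and function name are hypothetical, and treating both bin edges as inclusive is an assumption; the application may handle edges differently.

```python
from statistics import mean


def bin_average(profile, bins):
    """Average a profile within each depth bin.

    profile: list of (depth, value) pairs.
    bins: list of (center, half_width) pairs.
    Both bin edges are treated as inclusive (an assumption), so a value
    exactly on a shared edge contributes to both bins.
    """
    averaged = []
    for center, half_width in bins:
        inside = [v for z, v in profile
                  if center - half_width <= z <= center + half_width]
        if inside:
            averaged.append((center, mean(inside)))
    return averaged


# A synthetic CTD cast sampled every meter; value equals depth for clarity.
ctd = [(float(z), float(z)) for z in range(0, 31)]
reduced = bin_average(ctd, [(10.0, 5.0), (20.0, 5.0)])
```

The 31 CTD levels collapse to two averaged values, one per requested bin.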
(F) Product Selection
The result set files returned to you depend upon which options you request.
The standard deviation is calculated using the "nonbiased" or "n-1" method using
the following formula:
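The usual "n-1" (sample) estimator is s = sqrt( sum((x_i - mean)^2) / (n - 1) ), where n is the number of values. The sketch below assumes this is the formula intended; the function name is hypothetical.

```python
import math


def sample_std(values):
    """Sample standard deviation using the n-1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    avg = sum(values) / n
    return math.sqrt(sum((x - avg) ** 2 for x in values) / (n - 1))


s = sample_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

For these eight values the mean is 5 and the sum of squared deviations is 32, so s is the square root of 32/7.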
Seasonal Cycle, Time Series Returns both the Time Series of monthly
statistics (average, minimum, maximum and count of observations of T, S and sigma-t
for each year, month and depth level for which there are data) and the Seasonal
Cycle (average, minimum, maximum, standard deviation, count of observations and
count of months in the average). The Seasonal Cycle values are determined by an
un-weighted average over all months from the time series statistics.
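The difference between an un-weighted average of monthly means and a plain average of all observations is easy to see in a small example. The sketch below is illustrative only; the (year, month, value) layout and the function names are assumptions.

```python
from collections import defaultdict
from statistics import mean


def monthly_time_series(observations):
    """Average observations for each (year, month).

    observations: iterable of (year, month, value) tuples.
    """
    groups = defaultdict(list)
    for year, month, value in observations:
        groups[(year, month)].append(value)
    return {key: mean(vals) for key, vals in groups.items()}


def seasonal_cycle(series):
    """Un-weighted average of the monthly means for each calendar month.

    Every year-month counts once, however many observations it holds.
    """
    by_month = defaultdict(list)
    for (_year, month), avg in series.items():
        by_month[month].append(avg)
    return {month: mean(avgs) for month, avgs in sorted(by_month.items())}


obs = [(2000, 6, 10.0), (2000, 6, 12.0), (2000, 6, 14.0), (2001, 6, 20.0)]
series = monthly_time_series(obs)
cycle = seasonal_cycle(series)
```

June 2000 averages 12.0 and June 2001 averages 20.0, so the seasonal value for June is 16.0; averaging all four observations directly would give 14.0 instead.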
Complete Profile Data Extracts every value of temperature and salinity
referenced to depth, latitude, longitude and date. These files can be very
large. Many people request it simply because they can. Make sure it is what you
really require before requesting it. There are also some file size restrictions.
See Caveats.
Station Index Lists latitude, longitude and date/time for each profile selected.
The complete result set can consist of up to 5 ASCII text
files. All files are of the form qry_xx.txt, where xx is your unique query identifier.
Files are comma delimited with a header label for easy import into a spreadsheet
or database application. The definition and explanation of each file
follows (see detailed file description).
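Because the files are comma delimited with a header line, they can also be read with a few lines of Python. The sketch below uses an in-memory stand-in for a result file; the column names are invented for illustration and are not the actual headers, which are given in the detailed file description.

```python
import csv
import io

# Stand-in for the contents of a result file. The column names are
# hypothetical; consult the detailed file description for the real ones.
sample = io.StringIO(
    "latitude,longitude,datetime\n"
    "44.50,-63.40,1995-07-14 12:00\n"
    "44.75,-63.10,1995-07-15 08:30\n"
)


def read_result_file(handle):
    """Read a comma-delimited file with a header row into a list of dicts."""
    return list(csv.DictReader(handle))


rows = read_result_file(sample)
latitudes = [float(row["latitude"]) for row in rows]
```

For a downloaded file, pass an open file handle instead of the in-memory sample, for example one obtained with open(path, newline="").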
(G) Run
After the query specification is complete, selecting Run will submit the query.
For a complete list of MEDS codes, click here.