The Climate Database
The Ocean Sciences hydrographic database is a collection of temperature and salinity
data for the area roughly defined by 35° - 80° N and 42° - 100°
W. The data come from a variety of sources including hydrographic bottles, CTD casts
(either up or down casts), spatially and temporally averaged Batfish tows, and expendable,
digital or mechanical bathythermographs. Near real-time data in the form of IGOSS Bathy
or Tesac messages are also included. The database currently consists of approximately
782,000 profiles and 35 million individual observations from 1910 to the present. Updates
are made monthly.
Validation Initial validation is carried out by the
originating institute or organization. All data, whether from Canadian or foreign sources,
are also validated by the Marine Environmental Data Service (MEDS), the national data
center for the Department of Fisheries and Oceans. The primary validation procedures at
MEDS are described in the IOC publication GTSPP Real Time Quality Control Manual.
At BIO, the data are subjected to a set of final tests before
being incorporated into the database. One of the primary functions of this validation is
the determination and elimination of duplicate profiles.
For climatological purposes, we have defined a duplicate as any
profile which is within 0.02° of latitude, 0.03° of longitude (roughly 3
km) and 30 minutes of another profile.
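The duplicate test described above can be sketched in Python as follows. This is a minimal illustration, not the code used at BIO: the Profile record and its field names are assumptions, the thresholds are the ones quoted in the text, and treating "within" as inclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds quoted in the text above.
MAX_DLAT = 0.02                  # degrees of latitude
MAX_DLON = 0.03                  # degrees of longitude
MAX_DT = timedelta(minutes=30)   # time separation


@dataclass
class Profile:                   # hypothetical record layout
    lat: float                   # decimal degrees, positive north
    lon: float                   # decimal degrees, positive east
    time: datetime


def is_duplicate(a: Profile, b: Profile) -> bool:
    """Return True if the two profiles fall within all three tolerances."""
    return (abs(a.lat - b.lat) <= MAX_DLAT
            and abs(a.lon - b.lon) <= MAX_DLON
            and abs(a.time - b.time) <= MAX_DT)
```

Two casts taken 0.01° apart in latitude and 20 minutes apart would be treated as duplicates; the same pair taken an hour apart would not.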
Determining which duplicate to select is based on a data type
hierarchy. A CTD down cast is at the highest level, down through bottle casts, the
various BT types, and finally the low resolution IGOSS Tesac and Bathy messages. Low
resolution Bathy and Tesac messages get replaced with the higher resolution CTD or
XBT data as they become available after having worked their way through the collecting
agency, national data center if outside of Canada, and finally to MEDS. This process
may take a number of years before we receive the final version of the data to replace
the near real-time IGOSS data. If data type alone does not resolve the choice,
selection proceeds by preferring Canadian data over foreign, then the profile with the
greatest number of observations, and finally the profile with the greatest depth.
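One way to express this selection order is as a sort key, sketched below. The type codes, the Candidate record and the position of CTD up casts in the ranking are assumptions made for illustration; the text fixes only the general order (CTD down cast, bottle casts, BT types, IGOSS messages) and the three tie-breakers.

```python
from dataclasses import dataclass

# Hypothetical type codes, most preferred first. The rank of CTD_UP is assumed.
TYPE_RANK = {"CTD_DOWN": 0, "CTD_UP": 1, "BOTTLE": 2, "BT": 3, "IGOSS": 4}


@dataclass
class Candidate:                 # hypothetical record layout
    data_type: str               # one of the keys of TYPE_RANK
    canadian: bool               # True for Canadian sources
    n_obs: int                   # number of observations in the profile
    max_depth: float             # deepest observation, in meters


def preference_key(p: Candidate):
    """Smaller keys are preferred; later elements only break earlier ties."""
    return (TYPE_RANK[p.data_type], not p.canadian, -p.n_obs, -p.max_depth)


def select_profile(duplicates):
    """Pick the profile to keep from a group of duplicates."""
    return min(duplicates, key=preference_key)
```

Because tuples compare element by element, nationality is consulted only when the data types tie, observation count only when nationality also ties, and so on.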
Profiles which have been flagged as having failed the MEDS QC
are individually examined to determine if any may be salvaged. There is no attempt
to correct erroneous data; however, individual data levels may be discarded and a portion
of a profile retained.
In addition, the entire database is subjected to various ongoing
subjective and objective tests to improve the overall confidence in the data. Stations
with individual observations more than 3 standard deviations from the mean value have
been individually examined and removed when deemed appropriate. The mean values are
derived on a seasonal basis from a 1° grid, averaged over depths ranging from 25 meters
at the surface to 500 meters for the deep ocean.
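The objective part of this screening can be sketched as a simple flagging step. The function name and arguments below are assumptions; the grid-cell mean and standard deviation are taken as given, and flagged observations are only candidates for the manual examination described above, not automatic removals.

```python
def flag_outliers(values, cell_mean, cell_std, n_sigma=3.0):
    """Return one flag per value: True if it lies more than n_sigma
    standard deviations from the seasonal grid-cell mean."""
    return [abs(v - cell_mean) > n_sigma * cell_std for v in values]
```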
Interpolated values During the early 1970s, data
were sent to MEDS from BIO as inflection points based on a linear regression tolerance
of 0.01° or 0.01 psu. Also, during the period 1969-89, MEDS had a limit of 99 levels
for a single CTD profile, which they enforced by using a similar reduction technique. As
a consequence, much of the CTD data from this period is at a much reduced resolution.
As a one-time correction, all CTD data in the database with an average depth resolution
(maximum depth / number of observations) of less than 5 meters were interpolated to include
values at standard oceanographic depths.
Reference GTSPP Real Time Quality Control Manual,
IOC Manuals and Guides No. 22, UNESCO, 1990.
The Climate Application
This application extracts information from the Ocean Sciences hydrographic database
according to user-specified spatial and temporal criteria. Output results can be either
statistical summaries of the data or the actual data stored in the database (for input
into your own analyses). The query is performed off-line. You will be contacted by email
when your results become available. Results should normally be available within a few
hours (depending upon the size of your query and the number of requests ahead of you).
If you haven't had a reply within 24 hours, contact us and we will try to determine
what happened.
A Brief Tour
(A) Query Identification All of your queries are
assigned a unique query number and saved under your username. You can re-run existing
queries or edit them and submit them as new queries. You can assign a name under TITLE.
This name is saved internally in the results files for your reference.
(B) Area Selection Type You may define a geographical
area in one of three ways:
- choose from a list of predefined polygons (multiple selections are permitted).
The predefined polygons are shown in areas.html.
- provide your own polygon definition by latitude/longitude coordinates. To do
this, select the Polygon button at the bottom of your screen, follow the instructions
to create a new polygon, and then select it from the list of predefined polygons.
- define a rectangle by latitude/longitude coordinates. The blocks parameter permits
one to subdivide the entire rectangle into x by y blocks. For example,
specifying latitude from 42° to 45° and longitude from -62° to -65°
with 1° blocks in both latitude and longitude would result in defining 9 separate
1° grid squares for which statistics would be generated.
Note that the convention for longitude is positive East. Latitude
and Longitude must be specified as decimal degrees, but we provide a converter if you
prefer degrees, minutes, seconds.
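The effect of the blocks parameter on a rectangle can be illustrated with a short sketch. The function name and argument order are assumptions, and the sketch assumes the rectangle divides evenly into blocks; how the application treats a remainder is not stated here. Longitudes follow the positive-East convention, so the western edge is the more negative value.

```python
def rectangle_blocks(lat_min, lat_max, lon_min, lon_max, dlat, dlon):
    """Split a rectangle into blocks of dlat by dlon degrees.

    Returns a list of (south, north, west, east) tuples.
    Assumes the rectangle is an exact multiple of the block size.
    """
    n_lat = round((lat_max - lat_min) / dlat)
    n_lon = round((lon_max - lon_min) / dlon)
    blocks = []
    for i in range(n_lat):
        for j in range(n_lon):
            south = lat_min + i * dlat
            west = lon_min + j * dlon
            blocks.append((south, south + dlat, west, west + dlon))
    return blocks
```

Calling rectangle_blocks(42, 45, -65, -62, 1, 1) yields the 9 grid squares of the example above, from (42, 43, -65, -64) to (44, 45, -63, -62).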
(C) Time Specification Specify a continuous time period
from (month/day/year) to (month/day/year) and/or a seasonal window. Months are inclusive.
For example, months 6 to 9 selects June to September, and months 12 to 2 selects only
December to February. The default is all months for the entire time period for which there are data.
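The wrap-around behaviour of the seasonal window (12 to 2 meaning December to February) can be sketched as below. The function name is an assumption; it only illustrates the inclusive comparison described in the text.

```python
def month_in_window(month, first, last):
    """Return True if month (1-12) lies in the inclusive window first..last.

    When first > last the window wraps over the end of the year,
    so 12 to 2 covers December, January and February.
    """
    if first <= last:
        return first <= month <= last
    return month >= first or month <= last
```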
(D) Depth Specification Depth ranges are specified by
entering a string of comma-separated fields defining:
First_bin, Bin_size, Bin_interval, Last_bin
The string is read as "Starting with First_bin, plus/minus Bin_size, repeat every Bin_interval until
Last_bin". Multiple depth specifications can be included for each query.
Example
0, 5, 25, 100
150, 10, 50, 250
This will generate statistics for the bin ranges 0±5,
25±5, 50±5, 75±5, 100±5, 150±10, 200±10, 250±10.
If not specified, the last two values (Bin_interval and Last_bin) default to 0 and First_bin, respectively.
For example, 500,500 will extract all data for 500±500 meters. This is the same as the old specification.
The numbers do not have to be integers.
Real number entries will be rounded to the second decimal place during
query processing. If you are specifying multiple depth specifications, separate
each set by pressing the enter key, ensuring that each set appears on a line by
itself.
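The way a depth specification expands into bins can be sketched as follows. The function names are assumptions and this is not the application's own parser; it follows the rules stated above (two or four fields, defaults for the last two, rounding to two decimal places, one specification per line). Python's built-in rounding is used, which may differ from the application's in edge cases.

```python
def expand_depth_spec(line):
    """Expand 'First_bin, Bin_size[, Bin_interval, Last_bin]' into a list
    of (bin_centre, half_width) pairs."""
    fields = [round(float(f), 2) for f in line.split(",")]
    if len(fields) == 2:
        first, size = fields
        interval, last = 0.0, first      # defaults described in the text
    elif len(fields) == 4:
        first, size, interval, last = fields
    else:
        raise ValueError("expected 2 or 4 comma-separated fields: %r" % line)
    if interval <= 0:
        return [(first, size)]           # a single bin, no repetition
    bins = []
    k = 0
    while True:
        centre = round(first + k * interval, 2)
        if centre > last:
            break
        bins.append((centre, size))
        k += 1
    return bins


def expand_query_depths(text):
    """Expand several specifications, one per line, into a single bin list."""
    bins = []
    for line in text.splitlines():
        if line.strip():
            bins.extend(expand_depth_spec(line))
    return bins
```

For the two-line example above, expand_query_depths("0, 5, 25, 100\n150, 10, 50, 250") returns eight (centre, half_width) pairs, from (0.0, 5.0) to (250.0, 10.0), and "500,500" returns the single pair (500.0, 500.0).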
(E) Processing Options There are two optional selection
criteria. One or both may be selected.
TS Only selects only records that contain both temperature and salinity
observations. Because we have a lot of XBT data, there are many more temperature
observations than salinity. The statistics for sigma-t are based on individual
computations. They are not based on the overall statistics of temperature and salinity.
Bin Averaging is appropriate only if you have requested the complete profile
data. This option averages the values within a profile according to the depth
specification. What this does is reduce the resolution of highly sampled data to
something more closely resembling observations sampled much less frequently. Suppose
for example that you intended to use the individual observations to optimally estimate
temperature. CTD data, sampled at every meter, would dominate bottle data sampled
every 25 meters. Specifying bin averaging and depth ranges of 10 or 20 meters would
result in getting a single average CTD observation for the 10 (or 20) meter level,
much closer to the 25 to 50 meter resolution you would expect from a bottle.
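The idea behind bin averaging can be sketched for a single profile as below. The function name and the (centre, half_width) bin representation are assumptions, as is the choice to treat bin edges as inclusive; the application may handle edges differently.

```python
from statistics import mean


def bin_average(depths, values, bins):
    """Average the values of one profile within each depth bin.

    depths, values: parallel lists for a single profile.
    bins: list of (centre, half_width) pairs, in meters.
    Returns (centre, average) for every bin that contains data.
    """
    averaged = []
    for centre, half_width in bins:
        inside = [v for d, v in zip(depths, values)
                  if centre - half_width <= d <= centre + half_width]
        if inside:
            averaged.append((centre, mean(inside)))
    return averaged
```

A CTD cast sampled every meter from 0 to 30 meters, averaged with bins 10±5 and 20±5, is reduced to two values, one per bin.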
(F) Navigation Buttons
Submit Query - Submits the query to the server.
View Queries - View previous queries. First, Previous,
Next and Last navigate through the queries.
New Query - Creates a new query.
Edit - Edits the current query. Edited queries can be re-submitted with
Submit Query. A new query number will be assigned.
Polygons - Opens the polygon dialog to create new or edit existing polygons.
The new polygon will appear in the Polygon pick list.
Exit - Terminates the application.
(G) Result Information Set
The result set files returned to you depend upon which options you request.
The standard deviation is calculated using the "nonbiased" or "n-1" method using
the following formula:
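In standard notation, the n-1 estimate for n observations x_1, ..., x_n with mean x-bar can be written as below. The symbols are chosen here for illustration, and an algebraically equivalent computational form may be what the application actually evaluates.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```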
Seasonal Cycle, Time Series Returns both the Time Series of monthly
statistics (average, minimum, maximum and count of observations (T, S, sigma-t)
for each year, month and depth level for which there are data) and the Seasonal
Cycle (average, minimum, maximum, standard deviation, count of observations and
count of months in average). The Seasonal Cycle values are determined by an un-weighted
average over all months from the time series statistics.
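One reading of the un-weighted average is that each monthly mean in the time series counts equally, regardless of how many observations went into it. The sketch below follows that reading for a single variable and depth level; the function name, the input layout and the output keys are assumptions, and the minimum, maximum and observation count are omitted because the text does not say how they are combined.

```python
from collections import defaultdict
from statistics import mean, stdev


def seasonal_cycle(monthly_means):
    """Build a seasonal cycle from time series statistics.

    monthly_means: dict mapping (year, month) to that month's average.
    Returns, for each calendar month with data, the un-weighted average
    of the monthly means, their n-1 standard deviation (None when only
    one month is available) and the count of months in the average.
    """
    by_month = defaultdict(list)
    for (year, month), value in monthly_means.items():
        by_month[month].append(value)
    cycle = {}
    for month in sorted(by_month):
        vals = by_month[month]
        cycle[month] = {
            "average": mean(vals),
            "std_dev": stdev(vals) if len(vals) > 1 else None,
            "n_months": len(vals),
        }
    return cycle
```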
Complete Profile Data Extracts every value of temperature and salinity
referenced to depth, latitude, longitude and date. These files can be very
large. Many people request it simply because they can. Make sure it is what you
really require before requesting it. There are also some file size restrictions.
See Caveats.
Station Index Lists latitude, longitude and date/time for each profile selected.
The complete result set can consist of up to 5 ASCII text
files. All files are of the form qry_xx.qry, where xx is your unique query identifier.
Each file has a format definition at the beginning. The definition for each file and
explanation follows (example).