Sample Data: OpenWeatherMap¶
OpenWeatherMap is a repository of historical and intraday-updated current and forecasted weather data. Snowflake provides this weather data free of charge so you can try out our unique, highly-performance semi-structured columnarization functionality.
In this Topic:
Tables¶
The data set includes six main tables:
- WEATHER_14_TOTAL: Current weather of 20,000 cities updated hourly. The data is stored in a native JSON format described here and it accumulates history over time.
- HOURLY_16_TOTAL: Four days of hourly weather forecasts for over 200,000 cities updated daily. The data is stored in a native JSON format described here and it accumulates history over time.
- HOURLY_14_TOTAL: Four days of hourly weather forecasts for over 20,000 cities updated four times a day. The data is stored in a native JSON format described here and it accumulates history over time.
- DAILY_16_TOTAL: Twelve days of daily weather forecasts for our 200,000 cities updated daily. The data is stored in a native JSON format described here and it accumulates history over time.
- DAILY_14_TOTAL: Twelve days of daily weather forecasts for over 20,000 cities updated four times a day. The data is stored in a native JSON format described here and it accumulates history over time.
Query Examples¶
The following query retrieves the recent high and low temperature readings for New York City, converted from celsius to fahrenheit temperatures, along with the latitude and longitude for the readings:
select (V:main.temp_max - 273.15) * 1.8000 + 32.00 as temp_max_far, (V:main.temp_min - 273.15) * 1.8000 + 32.00 as temp_min_far, cast(V:time as TIMESTAMP) time, V:city.coord.lat lat, V:city.coord.lon lon, V from snowflake_sample_data.weather.WEATHER_14_TOTAL where v:city.name = 'New York' and v:city.country = 'US' order by time desc limit 10;
The following query compares weather forecasts to actual weather readings:
with forecast as (select ow.V:time as prediction_dt, ow.V:city.name as city, ow.V:city.country as country, cast(f.value:dt as timestamp) as forecast_dt, f.value:temp.max as forecast_max_k, f.value:temp.min as forecast_min_k, f.value as forecast from snowflake_sample_data.weather.daily_16_total ow, lateral FLATTEN(input => V, path => 'data') f), actual as (select V:main.temp_max as temp_max_k, V:main.temp_min as temp_min_k, cast(V:time as timestamp) as time_dt, V:city.name as city, V:city.country as country from snowflake_sample_data.weather.WEATHER_14_TOTAL) select cast(forecast.prediction_dt as timestamp) prediction_dt, forecast.forecast_dt, forecast.forecast_max_k, forecast.forecast_min_k, actual.temp_max_k, actual.temp_min_k from actual left join forecast on actual.city = forecast.city and actual.country = forecast.country and date_trunc(day, actual.time_dt) = date_trunc(day, forecast.forecast_dt) where actual.city = 'New York' and actual.country = 'US' order by forecast_dt desc, prediction_dt desc;