Advanced Analytics / Data Science

Advanced analytics, often referred to as data science or “Big Data”, covers a broad category of vendors, tools, and technologies that provide capabilities for statistical and predictive modeling. These tools share some features and functionality with BI tools, but they focus less on analyzing and reporting on past data. Instead, they examine large data sets to discover patterns and uncover business information that can be used to predict future trends.

Snowflake works with the following advanced analytics platforms and technologies:

For each solution, the software requirements and additional information are listed below its name.

dplyr

Requirements:

dplyr: 0.4.3
Snowflake: dplyr extension v0.1.1, available via dplyr-snowflakedb (GitHub)

Additional reading:

Integrating Snowflake with R via dplyr (Snowflake Engineering Blog)

Qubole

Requirements:

Qubole: Enterprise Edition
Snowflake: None. Qubole Data Service (QDS) provides native Spark integration through the Snowflake Connector for Spark.

Additional reading:

Qubole Quickstart Guide (Qubole Documentation)
Qubole-Snowflake Integration Guide (Qubole Documentation)

R Language

Requirements:

R: None
Snowflake: one of the following drivers:

JDBC Driver, available via snowflake-jdbc (Maven Central)
ODBC Driver, downloaded from the Snowflake web interface

Apache Spark

Requirements:

Spark: 2.0 or 2.1 (or later)
Scala: 2.10 or 2.11 (or later)
Snowflake: both of the following:

JDBC Driver, available via snowflake-jdbc (Maven Central)
Connector for Spark, available via spark-snowflake (Maven Central)
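
Once both Snowflake artifacts are on the Spark classpath, a job typically passes a small map of connection options to the connector. The following is a minimal sketch in Python (PySpark), not a definitive setup. The option keys (sfURL, sfUser, and so on) and the data source name are the ones the connector is generally documented to accept; the helper names snowflake_options and read_table, and every account value, are hypothetical placeholders introduced for illustration.

```python
# Sketch: assemble the option map read by the Snowflake Connector for Spark.
# All account values used here are hypothetical placeholders.

# Data source name under which the connector registers with Spark.
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"


def snowflake_options(account, user, password, database, schema, warehouse):
    """Return connector options keyed by the names the connector expects."""
    return {
        "sfURL": f"{account}.snowflakecomputing.com",
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options, table):
    """Load a Snowflake table as a Spark DataFrame.

    Requires a live SparkSession started with the JDBC driver and the
    Spark connector available; it is not called in this sketch.
    """
    return (
        spark.read.format(SNOWFLAKE_SOURCE_NAME)
        .options(**options)
        .option("dbtable", table)
        .load()
    )


# Hypothetical values, for illustration only.
opts = snowflake_options(
    "myaccount", "analyst", "secret", "SALES_DB", "PUBLIC", "ANALYTICS_WH"
)
```

In a real job, the password would normally come from a secrets store or environment variable rather than a literal, and read_table(spark, opts, "ORDERS") would be called with an existing SparkSession and a table name of your own.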

Additional reading: