10 Snowflake Concepts You Need To Know (+15 You Ought To)
### INTERFACE
1. **Web UI**: The central interface for managing and querying data.
2. **CLI (Command-line Interface)**: An alternative method to connect to Snowflake using an IDE like Visual Studio Code.
### SESSION VARIABLES
3. **Virtual Warehouses**: The compute engines responsible for performing data processing tasks.
4. **Worksheets**: Interface used to execute SQL queries. Each worksheet can have its own combination of role, virtual warehouse, database, and schema.
5. **Context Functions**: Provide information about various elements like session, account, users, etc., in the context of the executing query.
### OBJECT HIERARCHY
6. **Schemas**: Structures within a database used to group logically related tables and views, forming a part of the namespace along with the database.
a) **Public Schema**: A default schema created with every database.
b) **Information Schema**: Acts as a data dictionary and contains metadata like tables, columns, and their data types.
### TABLES & VIEWS
7. **Permanent Table**: Stores data until explicitly dropped by the user.
8. **Transient Tables**: Offer lower levels of data protection, suitable for less critical data.
9. **Temp Tables (Temporary Tables)**: Exists for the duration of a user session, not visible to other users, and automatically dropped at the end of a session.
10. **External Tables**: Provide a view on top of files stored in a stage, used for reading external data.
11. **Standard View**: A database object that treats the result of a query as a virtual table, allowing it to be used in further queries and joins.
12. **Materialized View**: A type of view where the query results are stored, improving performance by eliminating the need to re-run the query.
### DATA LOADING
13. **File Format**: Specification defining the structure of files to be processed, including aspects like header rows.
14. **Stage**: A logical concept representing an abstraction of a filesystem location, used for data storage and manipulation.
a) **External Stage**: A reference to a cloud storage location, used for managing external data.
b) **Internal Stages**: Area for loading data, typically from on-premises systems, using the CLI.
15. **Snowpipe**: A feature enabling continuous, automated data ingestion, or streaming.
### SEMI-STRUCTURED DATA
16. **VARIANT**: A universal data type in Snowflake capable of storing data of any type.
### DATA PROCESSING
17. **Tasks**: Scheduled execution of SQL statements, used for regular workloads like data transformation or processing.
18. **Conjugate Task**: A method to connect multiple tasks in a parent-child relationship, creating a hierarchy of tasks.
19. **Stream**: Monitors changes (inserts, updates, deletes) in a table, facilitating trigger-based data processing.
### DATA SHARING
22. **Secure Data Sharing**: Share data with other accounts via a soft reference to the tables, without physical data movement.
23. **Reader Account**: An account type for sharing data with non-Snowflake customers, acting as a subsidiary to the main account.
### ROLES
20. **System Roles**: Predefined roles include ACCOUNTADMIN, SYSADMIN, SECURITYADMIN, USERADMIN, and PUBLIC, each with specific privileges and responsibilities.
21. **Custom Roles**: User-defined roles. It's recommended to grant custom roles to the SYSADMIN role for comprehensive administrative privileges.
### PERFORMANCE OPTIMISATION & CONTINUOUS DATA PROTECTION
24. **Clustering**: A technique for data distribution into related storage blocks, involving sorting data into micro-partitions.
25. **Time Travel**: Enables users to access and retrieve historical data at or before a specific timestamp, including changes to data in tables, schemas, and databases.