10 Snowflake Concepts You Need To Know (+15 You Ought To)

### INTERFACE

1. **Web UI**: The central interface for managing and querying data.

2. **CLI (Command-line Interface)**: An alternative method to connect to Snowflake using an IDE like Visual Studio Code.

### SESSION VARIABLES 

3. **Virtual Warehouses**: The compute engines responsible for performing data processing tasks.

4. **Worksheets**: Interface used to execute SQL queries. Each worksheet can have its own combination of role, virtual warehouse, database, and schema.

5. **Context Functions**: Provide information about various elements like session, account, users, etc., in the context of the executing query.

### OBJECT HIERARCHY

6. **Schemas**: Structures within a database used to group logically related tables and views, forming a part of the namespace along with the database.

a) **Public Schema**: A default schema created with every database.

b) **Information Schema**: Acts as a data dictionary and contains metadata like tables, columns, and their data types.

### TABLES & VIEWS

7. **Permanent Table**: Stores data until explicitly dropped by the user.

8. **Transient Tables**: Offer lower levels of data protection, suitable for less critical data.

9. **Temp Tables (Temporary Tables)**: Exists for the duration of a user session, not visible to other users, and automatically dropped at the end of a session.

10. **External Tables**: Provide a view on top of files stored in a stage, used for reading external data.

11. **Standard View**: A database object that treats the result of a query as a virtual table, allowing it to be used in further queries and joins.

12. **Materialized View**: A type of view where the query results are stored, improving performance by eliminating the need to re-run the query.

### DATA LOADING

13. **File Format**: Specification defining the structure of files to be processed, including aspects like header rows.

14. **Stage**: A logical concept representing an abstraction of a filesystem location, used for data storage and manipulation.

a) **External Stage**: A reference to a cloud storage location, used for managing external data.

b) **Internal Stages**: Area for loading data, typically from on-premises systems, using the CLI.

15. **Snowpipe**: A feature enabling continuous, automated data ingestion, or streaming.

### SEMI-STRUCTURED DATA

16. **VARIANT**: A universal data type in Snowflake capable of storing data of any type.

### DATA PROCESSING

17. **Tasks**: Scheduled execution of SQL statements, used for regular workloads like data transformation or processing.

18. **Conjugate Task**: A method to connect multiple tasks in a parent-child relationship, creating a hierarchy of tasks.

19. **Stream**: Monitors changes (inserts, updates, deletes) in a table, facilitating trigger-based data processing.

### DATA SHARING

22. **Secure Data Sharing**: Share data with other accounts via a soft reference to the tables, without physical data movement.

23. **Reader Account**: An account type for sharing data with non-Snowflake customers, acting as a subsidiary to the main account.

### ROLES

20. **System Roles**: Predefined roles include ACCOUNTADMIN, SYSADMIN, SECURITYADMIN, USERADMIN, and PUBLIC, each with specific privileges and responsibilities.

21. **Custom Roles**: User-defined roles. It's recommended to grant custom roles to the SYSADMIN role for comprehensive administrative privileges.

### PERFORMANCE OPTIMISATION & CONTINUOUS DATA PROTECTION

24. **Clustering**: A technique for data distribution into related storage blocks, involving sorting data into micro-partitions.

25. **Time Travel**: Enables users to access and retrieve historical data at or before a specific timestamp, including changes to data in tables, schemas, and databases.

Previous
Previous

Book Summary: “The First Rule of Mastery: Stop Worrying About What People Think of You” by Michael Gervais

Next
Next

Book Summary: "The Great Mental Models" by Shane Parrish