Data Analyst: The Top 3 Skills

The Top 3 Tech Skills for a Data Analyst

Welcome everyone šŸ‘

What’s inside this newsletter?

  • Definition: Data Analyst

  • Most Important Language: SQL

  • The most important tool: PowerBI

  • The most Important Cloud: Azure

Before we dive in we need to define the role of a DATA ANALYST. There are a ton of myths and misconceptions about what this role is and what it is not. This role was defined decades ago. Let’s use a definition from one of the companies who created the role, Microsoft.

A data analyst enables businesses to maximize the value of their data assets through visualization and reporting tools. They're also responsible for profiling, cleaning, and transforming data.

🧨 WARNING: This is NOT the same type of data cleansing that happens in machine learning. The name is identical, the process and end result is not. When you cleanse data in machine learning, you’re taking all your data and converting it to numbers. When you clean data for the data analyst role, you are removing things that will detract from readability of that data. 

The responsibilities of a data analyst also include designing and building scalable and effective data models, and enabling and implementing the advanced analytics capabilities into reports for analysis.

ā

A data analyst enables businesses to maximize the value of their data assets through visualization and reporting tools.

It’s important to point out that within the above definition, there is NO Python nor is there any machine learning modeling. A data analyst is NOT a machine learning engineer or a data engineer. A data analyst is the ONLY entry level job within the data space.

🧨 WARNING: There is no such thing as a BIG data analyst. Data analysts don’t decide nor do they work with big data. Data analysts simply access the data they need for dashboards and reports. The companies data stewards are the data engineers and the DBAs. They design how the data is stored and managed. When the data analyst needs access to ā€œbig data,ā€ the data engineer will provide every facet specific to access, where the data is stored, how the data is stored… etc.

Ok, now that we have a clear definition of the data analyst role, what are the top technical skills for the data analyst?

NUMBER 1: SQL šŸ„‡

Every technology role has primary skills and every role has tertiary skills. The top skill or langauge for almost every data role is SQL. Here’s a list of roles in the real-world where SQL is the top skill or one of the top ones.

  • Data Analyst

  • DBA

  • Data Engineer

  • Machine Learning Engineer

  • Faker Scientist šŸ˜‚

  • ETL Developer

What is SQL? SQL is a computer language. It is the language of data. SQL is an acronym that stands for Structured Query Language. SQL is used with almost every single data store in existence. The top two data stores in the real-world are relational databases and data warehouses. 

Relational databases define relationships between different tables, which are structured as rows and columns, using concepts like primary and foreign keys to establish connections between data stored in various tables. Let’s simply this before we delve into primal keys, foreign keys, et.

ā

A database is a container for objects. The most important object is the table. In relational databases, tables are connected by objects called keys.

A data warehouse is a type of data management system that is designed to enable and support business intelligence (BI) activities, especially analytics. Data warehouses are solely intended to perform queries and analysis and often contain large amounts of historical data. The data within a data warehouse is usually derived from a wide range of sources. Many data warehouses are populated with data from relational databases.

ā

A data warehouse is a repository of highly structured historical data which has been processed for a defined purpose.

How does a data analyst use SQL? Almost all of the companies financials, marketing and accounting data live in relational databases or data warehouses. As a data analyst, you’ll need to author the queries to extract only the data you need for your KPIs and dashboards. There are tools to help you but those tools won’t get you past the SQL interview. Learn SQL now. It will act as the building block for any other data career moving forward.

NUMBER 2: PowerBI 🄈

The number two tool for the data analyst inside the Microsoft ecosystem is PowerBi.

What exactly is PowerBI? Power BI is a collection of software services, apps, and connectors that work together to turn your unrelated sources of data into coherent, visually immersive, and interactive insights. Your data might be an Excel spreadsheet, or a collection of cloud-based and on-premises hybrid data warehouses. Power BI lets you easily connect to your data sources, visualize and discover what's important, and share that with anyone or everyone you want.

ā„¹ļø ECOSYSTEM:  A technology ecosystem is the collection of tech solutions a company uses to run its business, and how these solutions connect with each other. For example, let’s say you work on a windows laptop and use Azure and Fabric. You also use SQL Server. All these are part of the Microsoft ecosystem for data professionals.

You’ll spend most of your time working inside PowerBI. Wait. If this is where I’ll spend most of my time, why is it number two? Because the interface is worthless without data. If you can’t source the underlying data, housed in relational databases or data warehouses, then the interface is useless. I.

On the dashboard below there are graphs, charts and KPIs. The graphs are charts are fairly straightforward. However, what’s a KPI? It’s an acronym for key performance indicator. A KPI is a quantifiable measure. Total volume of something is an example of a KPI. Total market share is another example of a KPI. All the numbers on the left hand side of the dashboard below are KPIs.

PowerBI Dashboard

NUMBER 3: Cloud Skills šŸ„‰

Approximately 94% of all companies have some of their data in the cloud. Around 60% of all the world’s data lives in a cloud. In today’s environment, cloud skills aren’t optional, they are mandatory. 

ā

Over 90% of companies worldwide already use cloud services—including AWS, Google, and Microsoft. The number of cloud computing users will grow significantly in the coming years.

I’ve worked with two cloud services on several projects. I’ve spent most of my time with AWS and had two short contracts with Azure. I’m speaking strictly from a data perspective.

NOTE: In the context of Amazon Web Services (AWS), a service refers to a specific, cloud-based product or offering that provides a particular functionality, such as compute, storage, database, or networking, delivered on demand and accessed via the internet. For example, AWS has RDS or relational database service.

ā

Azure is superior to AWS in almost every facet when it comes to working with data.

Mike

I worked with a tool on AWS called data migration service. šŸ’© It was easily the worst service I’ve ever worked with in a real-world setting. There were multiple days on that contract I was sitting on a support call with AWS trying to get it fixed. I’m not the only one who had issues either. Here’s a reddit post detailing the problems this poster had. I choose this post because it most aligns with how I feel about AWS in general. This list is long.

The good news for you is… you’ve chosen the world’s best cloud… Azure. I recently completed a course on Microsoft’s new data centric approach to data, called Fabric. It is light years away from anything AWS has or is working on. Here’s the curriculum for the Fabric course if you want to get an idea what it looks like.

Fabric is a part of Azure. Basic cloud knowledge isn’t hard to attain and it’s very interesting. PowerBI is also a part of Fabric. If you’re interested in a short overview of Azure’s data architecture, check out this walkthrough created by Microsoft. Warning, this walkthrough is aimed at data engineers but it will give you a fantastic high-level look at working with data on Azure.

We’ve defined the data analyst and we’ve listed the three most important skills every data analyst will need to know. There are others. However, when you’re starting out it’s easy to get overwhelmed with language and tools. When you’re new, stay focused on the core skills until you have a solid handle on them. After a few months with each, branch out to other skills companies require for this role.

We’ve reached the end of this one. šŸŽ‰

Let’s summarize the highlights. 

  • The data analyst is an entry level role.

  • A data analyst creates dashboards and KPIs.

  • Data cleansing for the data analyst and machine learning engineer are not the same thing.

  • The two most prevalent data stores in the real-world are the relational database and the data warehouse.

  • Relational databases store transactional data.

  • Data warehouses store historical data, often for reporting.

  • The top language for any data role is SQL.

  • The top skill for the data analyst is PowerBI.

  • The top cloud for the data analyst is Azure. (Mike’s Opinion)

Thank you for reading and stay focused on your goals. šŸ¤ŸšŸ»