Four tools to help you become a better data analyst

Four Tools to Help you Become a Data Analyst

Gathering and analysing data has changed how business decisions are made and how resources are allocated. Through a better use and understanding of data, business managers can plan and enable better strategies with increased confidence. When understood, the companies data can increase administrative efficiency and add competitive advantage – data analytics provides new insight and actionable intelligence.

Data analytics is the process of examining raw data, understanding the processes underlying that data, discovering the patterns, and drawing conclusions about the information and finally communicating the results in a way that has the biggest possible impact.

So what are some of the tools available that will help you analyse your businesses data and become the data analyst that makes the correct data-driven decisions?

Microsoft Excel

Microsoft Excel is the industry standard for spreadsheets and is the first of the list of tools available for the budding data analyst. Microsoft Excel offers an easily accessible and familiar tool to users to perform financial modelling, put together a quick proof of concept or for that one-time analysis to get a quick conclusion for a new business question.

With the powerful importing tool PowerQuery and the compression capabilities of PowerPivot that extend the capabilities of the normal pivot table data, allow more advanced calculations, and  add the ability to import data from multiple sources. Excel is the perfect first tool for any new data analysts’ tool belt.

SQL Server

Microsoft SQL Server (or other relational database) will likely be powering your business’s transactional ERP system.  As a future high flying data analyst, knowing how to query your source system using SQL queries will quickly allow you to answer some basic questions. Including; “How many sales did we make last month?” or “How many stock items do we have on-hand?”

Writing a small SQL query can provide you with a quick answer to a question and many results can be exported to Excel for further analysis.

Just be warned, executing any SQL query could have unintended consequences. Especially when querying large data sets; the larger the data set, the more complex the query, the longer the wait to get your results.

This powerful tool can be the utility knife on your data analysis tool belt, it is versatile, but it does require some care to use.

Analytics Tools

Analytics tools such as Business Analytics, TimeXtender and Jet Enterprise are purpose built to collect the data from multiple data sources and rapidly combined this data into meaningful groups that you can then analyse. If you need to gather data, do not do it manually yourself, get a tool to help.

Analytics tools give you the ability to answer a specific set of commonly-asked questions such as “How much profit have we made Year-on-Year over the last three years by region?”, or “What was our best selling over the last quarter?” Analytics tools are a big step into the area of Business Intelligence (BI) reporting and data manipulation.

Analytics tools can provide these answers in easy to understand, fast acting, template type, reports that can assist a business in identifying underlying trends or gaps in the business and improving profitability of the business when understood.

Adding one of these analytics tools to your arsenal is like adding a backpack to your utility belt. It can be a slightly larger than originally expected but will provide nearly any answer to any question – provided it was packed correctly beforehand.

Dashboards

All these tools can mean little to business unless they are presented in a way that is meaningful to the end business users. Sometimes this is done in a simple spreadsheet report with baseline numbers, but more likely this requires a little more glamour and shine.

A powerful in-memory BI tool such as Power BI from Microsoft not only offers a powerful data importing and data manipulation tool all in one (with easy to learn features that allows non-technical users the ability to import and understand their own data) Power BI also offers great visualisation options to present the business data in a meaningful, visual manner. These help to keep critical metrics in the line of sight at all times, helping the members of your team to stay focused on the numbers.

Having a dashboard or visualisation tool (particularly a strong BI focused tool) will finish off your utility belt with the required survival kit. This will get you through the harshest of data values and make a big impact fast.

Just a note of caution; No matter what tool you add to your belt or how you utilise it to understand your data – be aware that a tool is only as good as the data you provide it. If you provide incorrect data to any tool you will get incorrect information out, if you provide insufficient data to a tool you will likely end up with the wrong conclusions.

What does this mean for you?

The business world has changed. There is more data available and more being generated in every aspect of business at a more rapid rate than ever before; there are many different methods and tools to analyse this data. Businesses need to understand this data and require data analysts to guide them through the unknown paths that are this new and scary data jungle.

With the above guide, a fully prepared tool belt will assist you in being that data analyst that guides your business through it all.

[Originally Published on the Karabina/IS Partners company blog]

When testing ETL tools here are 5 things to bear in mind

As it is officially braai (barbeque, BBQ) season in South Africa, I decided to compare the elements involved in testing ETL tools with that of a braai. After all, you would not simply light a fire and throw your carefully selected steak onto the flames.

There are several steps you take in the process to ensure that your steak is cooked to your satisfaction. As a data warehouse is not a specific type of technology and there are many types of ETL tools available, proper testing is essential to ensure that your ETL tool is servicing your data warehouse correctly.

So, in the below, we compare the testing of an ETL tool to the testing of your braai skills.

1. When should testing happen and how often?

o ETL: ETL Testing should take place during development of a new system and when any changes occur to that system (source system changes (upgrades) or manual customizations to the reporting solution).

o Braai: Once you have successfully built your braai, and the lit the fire, is important to regularly test the heat of the braai to ensure that you cook your steaks at an optimal temperature. Make the necessary changes to the braai, stoke the coals, add additional fire wood, and change the location of the Weber if necessary.

o ETL: Occasional testing beyond implementation that can usually be accomplished through managed reporting (if you get an error report from the system this is an indication that something is wrong).

o Braai: Occasional testing is necessary now that you have adjusted your fire. High flames are an indication that it is nowhere near ready; take caution around the open flames.

2. Who is responsible for testing?

o ETL: During implementation – the implementation consultant/developer.

o Braai: During initiation of the braai / fire – the braai master (developer who is usually pedantic about the fire they have lit) is responsible for testing.

o ETL: During Customizations to the report tool – the implementation consultant/developer performs these changes.

o Braai: On occasion the braai master may send a braai assistant (consultant) to check or adjust the braai.

o ETL: During daily automated runs – System administrator (SQL Administrator).

o Braai: Advanced braai masters (developers) have automated tools such as a braai thermometer that will sound an alarm when the optimal temperature is reached reducing manual intervention.

3. What are the benefits of testing ETL?

o ETL: Faults are eliminated early – before the reports get to Business Users.

o Braai: Faults are eliminated early – having a braai with sub-standard charcoal or braai wood will affect the end result.

o ETL: The system ends up being more stable and reliable

o Braai: Your guests at your braai can trust that they will have a phenomenally cooked steak.

o ETL: Users can trust the system (proof can be shown that it balances).

4. Types of ETL Tests

o Constraint Testing (Developer checks the following constraints (NOT NULL, UNIQUE, Primary Key, Foreign Key, Default, & NULL)

o Source to Target Count and Data Validation

o Duplicate Checking

o Incremental and/or Historical load processes

o Regression Testing (Does a fix on one field break any other fields)

5. Phases of ETL testing:

o Requirements/Design Review

o Test Planning (What tests to perform)

o Test Designing (Build the tests)

o Test Environment setup

o Test Execution

o Test Reporting (results of the tests)

Most braai masters will tell you that there is a very specific method that each one of them uses to create the best braai. The same is relevant to ETL developers and consultants when it comes to testing ETL Tools on the data warehouse.

Our approach is to use data warehouse automation tools such as TimeXtender​, as they reduce the dependency on senior BI technical skills as well as reducing development time by an estimated 80%.

Image credit: rapgenius

[Originally appeared on the Karabina/IS Partners company blog 18 September 2014]

Quick Windows Server 2008 R2 – SQL Server Firewall setup

Setting up SQL Server 2012 I received the same warning I have always received when setting up a SQL instance on a new Windows server:

SQL Setup Support Rules – Firewall Warning

So I executed my handy Windows Server Firewall batch file for SQL to set up the default ports on the Firewall, which executed successfully, but I noticed the following warning during execution:

Warning – netsh firewall is deprecated

After a quick check I updated the file with the new correct script options (I have left the original script lines in, just commented them out for reference:

@echo =========  SSRS Ports  ===================
@echo Enabling SQLServer default instance port 1433
REM Deprecated: netsh firewall set portopening TCP 1433 "SQLServer" 
netsh advfirewall firewall add rule name="SQLServer" dir=in action=allow protocol=TCP localport=1433

@echo Enabling Dedicated Admin Connection port 1434
REM Deprecated: netsh firewall set portopening TCP 1434 "SQL Admin Connection" 
netsh advfirewall firewall add rule name="SQL Admin Connection" dir=in action=allow protocol=TCP localport=1434

@echo Enabling conventional SQL Server Service Broker port 4022  
REM Deprecated: netsh firewall set portopening TCP 4022 "SQL Service Broker" 
netsh advfirewall firewall add rule name="SQL Service Broker" dir=in action=allow protocol=TCP localport=4022

@echo Enabling Transact-SQL Debugger/RPC port 135 
REM Deprecated: netsh firewall set portopening TCP 135 "SQL Debugger/RPC" 
netsh advfirewall firewall add rule name="SQL Debugger/RPC" dir=in action=allow protocol=TCP localport=135

@echo =========  SSAS Ports  ==============
@echo Enabling SSAS Default Instance port 2383
REM Deprecated: netsh firewall set portopening TCP 2383 "Analysis Services" 
netsh advfirewall firewall add rule name="Analysis Services" dir=in action=allow protocol=TCP localport=2383

@echo Enabling SQL Server Browser Service port 2382
REM Deprecated: netsh firewall set portopening TCP 2382 "SQL Browser" 
netsh advfirewall firewall add rule name="SQL Browser" dir=in action=allow protocol=TCP localport=2382

@echo =========  Misc Applications  ==============
@echo Enabling HTTP port 80 
REM Deprecated: netsh firewall set portopening TCP 80 "HTTP" 
netsh advfirewall firewall add rule name="HTTP" dir=in action=allow protocol=TCP localport=80

@echo Enabling SSL port 443
REM Deprecated: netsh firewall set portopening TCP 443 "SSL" 
netsh advfirewall firewall add rule name="SSL" dir=in action=allow protocol=TCP localport=443

@echo Enabling port for SQL Server Browser Service 'Browse' Button
REM Deprecated: netsh firewall set portopening UDP 1434 "SQL Browser" 
netsh advfirewall firewall add rule name="SQL Browse" dir=in action=allow protocol=TCP localport=1434

@echo Allowing multicast broadcast response on UDP (Browser Service Enumerations OK)
REM Deprecated:netsh firewall set multicastbroadcastresponse ENABLE
netsh advfirewall set currentprofile settings unicastresponsetomulticast enable

@echo Complete, Check your results.
pause

Hope it make your future installs easier.

Tested on:

  • Windows Server 2008 R2 Sp1 while installing SQL Server 2012
  • Windows Server 2008 R2 while installing SQL Server 2008 R2

Sources:

http://technet.microsoft.com/en-us/library/cc771046(v=ws.10).aspx

http://support.microsoft.com/kb/947709

http://support.microsoft.com/kb/949543

Note: This script is provided “as is” without any representations or warranties, express or implied – use of this script is at your own risk.