Career As A Data Analyst
There is strong demand for data analysts in today's market, reportedly even greater than the demand for programmers and software developers, and it extends all across the United States.
Qualifications
To be an ideal candidate for a data analyst position, you need knowledge of and experience in analyzing the market environment, and you should hold a degree in a field related to statistics. You also need to be well versed in tools such as Excel, SAS and SPSS.
Skills
Data analysis requires intensive research work, so a data analyst needs strong problem-solving skills. The main skill is extracting the information relevant to a topic from raw data. You should therefore be capable in data mining, data mapping and data warehousing. Tracking data and identifying trends and patterns in the marketplace are also very important tools in the hands of a data analyst.
Statistical and mathematical knowledge is required as well. You need to know specialized data management software, be able to analyze data statistically, and be able to generate, audit and validate reports. These skills help you analyze the data accurately and, in turn, produce a report that meets the client's requirements.
Other skills required are verbal and writing skills, along with data interpretation and presentation skills.
What Does A Data Analyst Do?
A data analyst searches for information that addresses the particular requirements of a client. You must therefore be able to ask yourself the right questions about a topic and keep researching until you find an appropriate solution. You also have to trace the data back to its original source and be able to evaluate it.
You should be able to compare data statistically and provide appropriate solutions. As a data analyst you will consult a large number of data sources, so work on your reporting skills to present your findings in a simple and effective manner. You also have to audit the report, which must be presentable and appealing to the client.
Job Titles
Junior Data Analyst: A junior analyst searches for the appropriate data and provides sufficient material to be analyzed. They also prepare statistical diagrams and flowcharts.
Senior Data Analyst: A senior analyst consults and communicates with the client. They prepare the final report and present it.
Data Analysis Project Manager: Duties include project organization and methodology. The project manager audits the reports and makes sure that they fulfill the needs of the client.
Since all these are specialized skills, which are in great demand in today’s technological world, data analysts have high growth prospects. If you have the required qualifications and skills, and are keenly interested in pursuing a career in this field, then being a data analyst is a good option for you.
Dimensional Data Modeling For Data Warehouses
The dimensional data model is the most common design concept used by data warehouse designers to build data warehousing systems. It is also the underlying data model used by many of the commercial OLAP products available on the market today. Some of the terms commonly used in this type of modeling are: Dimension, a category of information (e.g. the time dimension); Attribute, a unique level within a dimension (e.g. Month is an attribute in the time dimension); and Hierarchy, the specification of levels that represents the relationship between different attributes within a dimension (e.g. Year → Quarter → Month → Day).
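As a rough illustration of the Year → Quarter → Month → Day hierarchy described above, the following Python sketch derives each level from a single calendar date. The function name `time_hierarchy` and the dictionary keys are assumptions made for this example, not part of any particular OLAP product.

```python
from datetime import date


def time_hierarchy(d: date) -> dict:
    """Return the Year -> Quarter -> Month -> Day levels for one date."""
    # Months 1-3 fall in quarter 1, months 4-6 in quarter 2, and so on.
    quarter = (d.month - 1) // 3 + 1
    return {
        "year": d.year,
        "quarter": f"Q{quarter}",
        "month": d.month,
        "day": d.day,
    }


levels = time_hierarchy(date(2024, 5, 17))
print(levels)
```

Each key corresponds to one attribute of the time dimension, and the ordering of the keys mirrors the hierarchy from the coarsest level to the finest.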
A dimensional data model contains two types of tables:
Fact Table: The fact table in a dimensional data model contains the measures of interest, that is, the measurements, metrics or facts of a business process. Take the sales amount of a business as an example. The amount can be a monthly sales number or the sales number for a single day. This measure is stored in the fact table at the appropriate granularity. For sales measures, a fact table generally contains three columns: a date column, a store column and a sales amount column. Besides the measurements, the table also contains foreign keys to the dimension tables.
Dimension Table: The dimension table in a dimensional model represents the context of the measurements. That context can be understood as the who, what, where, when and how of a measurement (subject). For example, in the business process Sales, the characteristics of the ‘monthly sales number’ measurement would be Location (where), Time (when) and Product Sold (what). A dimension table contains a number of dimension attributes, or columns. In the Location dimension, the attributes can be Location Code, State, Country and Zip Code. Dimension attributes can also form one or more hierarchical relationships.
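The relationship between the two table types can be sketched in plain Python. In this minimal example, which assumes made-up location codes and sales figures, the fact rows are stored at daily granularity and carry a foreign key into a Location dimension; the hypothetical helper `monthly_sales_by_state` rolls them up to a monthly number per state.

```python
from collections import defaultdict

# Dimension table: one row per location, keyed by a hypothetical location code.
location_dim = {
    "L01": {"state": "Ohio", "country": "USA", "zip_code": "43004"},
    "L02": {"state": "Texas", "country": "USA", "zip_code": "73301"},
}

# Fact table at daily granularity:
# (date, location_code foreign key, sales_amount measure).
sales_fact = [
    ("2024-01-05", "L01", 120.0),
    ("2024-01-20", "L01", 80.0),
    ("2024-01-11", "L02", 200.0),
    ("2024-02-02", "L02", 50.0),
]


def monthly_sales_by_state(facts, locations):
    """Roll daily facts up to (month, state) using the dimension attributes."""
    totals = defaultdict(float)
    for day, location_code, amount in facts:
        month = day[:7]  # "YYYY-MM" prefix of the ISO date string
        state = locations[location_code]["state"]
        totals[(month, state)] += amount
    return dict(totals)


monthly = monthly_sales_by_state(sales_fact, location_dim)
print(monthly)
```

The fact rows hold only the measure and the keys; every descriptive attribute (state, country, zip code) lives in the dimension table and is looked up when the data is summarized.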
If you are planning to build a data warehouse for your organization, you should first decide what your data warehouse will contain. Depending on your organizational goals, you can choose the dimensions that best meet your requirements. For example, if you want to build a data warehouse that contains monthly sales numbers across multiple store locations, across time and across products, then your dimensions would be Location, Time and Product.
In designing data models for data warehouses or data marts the most commonly used schema types are Star Schema and Snowflake Schema.
Star Schema: In this type of schema design, a single object, the fact table, is placed in the middle and is radially connected to the surrounding objects, the dimension tables, like a star. Each dimension is represented as a single table, and the primary key in each dimension table is related to a foreign key in the fact table. A simple star schema consists of one fact table, while a complex star schema may contain more than one fact table.
Snowflake Schema: This type of schema design can be seen as an extension of the star schema. In this design, each point of the star, that is, each dimension table, branches into more points. In other words, in a star schema each dimension is represented by a single dimension table, while in a snowflake schema that dimension table is normalized into multiple lookup tables, each representing a level in the dimensional hierarchy.
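A simple star schema along these lines might look like the following sketch, which uses Python's built-in `sqlite3` module and an in-memory database. The table names, column names and sample rows are all assumptions chosen for illustration.

```python
import sqlite3

star_conn = sqlite3.connect(":memory:")
star_conn.executescript("""
-- One table per dimension, each with its own primary key.
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, state TEXT, country TEXT);
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_time     (time_id     INTEGER PRIMARY KEY, year INTEGER, month INTEGER);

-- The central fact table holds the measure plus one foreign key per dimension.
CREATE TABLE fact_sales (
    location_id  INTEGER REFERENCES dim_location(location_id),
    product_id   INTEGER REFERENCES dim_product(product_id),
    time_id      INTEGER REFERENCES dim_time(time_id),
    sales_amount REAL
);

INSERT INTO dim_location VALUES (1, 'Ohio', 'USA'), (2, 'Texas', 'USA');
INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_time     VALUES (1, 2024, 1), (2, 2024, 2);
INSERT INTO fact_sales   VALUES
    (1, 1, 1, 100.0), (1, 2, 1, 50.0), (2, 1, 1, 70.0), (2, 1, 2, 30.0);
""")

# Monthly sales per state: join the fact table to two of its dimensions.
star_rows = star_conn.execute("""
    SELECT l.state, t.month, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    JOIN dim_time     AS t ON t.time_id = f.time_id
    GROUP BY l.state, t.month
    ORDER BY l.state, t.month
""").fetchall()
print(star_rows)
```

Every query follows the same shape: start at the fact table and join outward by one step to whichever dimensions the question needs.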
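To make the contrast concrete, the sketch below snowflakes a Location dimension into separate lookup tables for country and state, again using Python's `sqlite3` module. The table names, column names and sample data are hypothetical; a real design would normalize whichever hierarchy levels the business needs.

```python
import sqlite3

snow_conn = sqlite3.connect(":memory:")
snow_conn.executescript("""
-- The Location dimension is split into one lookup table per hierarchy level.
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_state (
    state_id   INTEGER PRIMARY KEY,
    state_name TEXT,
    country_id INTEGER REFERENCES dim_country(country_id)
);
CREATE TABLE dim_location (
    location_id INTEGER PRIMARY KEY,
    zip_code    TEXT,
    state_id    INTEGER REFERENCES dim_state(state_id)
);
CREATE TABLE fact_sales (
    location_id  INTEGER REFERENCES dim_location(location_id),
    sales_amount REAL
);

INSERT INTO dim_country  VALUES (1, 'USA'), (2, 'Canada');
INSERT INTO dim_state    VALUES (1, 'Ohio', 1), (2, 'Texas', 1), (3, 'Ontario', 2);
INSERT INTO dim_location VALUES (1, '43004', 1), (2, '73301', 2), (3, 'M5H', 3);
INSERT INTO fact_sales   VALUES (1, 100.0), (2, 70.0), (3, 40.0), (1, 10.0);
""")

# Reaching the country level now takes a chain of joins through the lookups.
country_totals = snow_conn.execute("""
    SELECT c.country_name, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    JOIN dim_state    AS s ON s.state_id = l.state_id
    JOIN dim_country  AS c ON c.country_id = s.country_id
    GROUP BY c.country_name
    ORDER BY c.country_name
""").fetchall()
print(country_totals)
```

The trade-off shows up in the query: the normalized tables avoid repeating state and country names on every location row, at the cost of extra joins to climb the hierarchy.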
Choosing a particular type of schema design depends on personal preference as well as business needs. So, it is up to you which one you choose among the two for your data warehouse project.
Proper Data Security And Storage Methods
The PCI DSS (Payment Card Industry Data Security Standard) requires that any merchant who accepts, processes, stores or transmits sensitive credit card information must do everything possible to protect and guard that data. Proper data security and storage, however, can be difficult to achieve in-house.
Data security and storage make up a major portion of the PCI DSS and are also a necessary part of maintaining trust with your customers. In an age where personal information is a valuable commodity, customers need to know that their transactions are secure and that you place a priority on guarding their personal data.
The third requirement of the PCI DSS states simply: “Protect stored cardholder data.” This may be a simple thing to say, but that doesn’t necessarily make it an easy thing to implement, nor does it downplay the importance. There are quite a few individual security controls that are required before you can say that you have created the proper data security and storage environment.
The first step is encryption. If you must store sensitive information on your own system you must encrypt it. This is a basic step because if a criminal intruder should happen to bypass all the other security measures that are in place, all they will find on your system are strings of random gibberish that are useless without the encryption key.
The next step is to limit the amount of cardholder data on your system. This means keeping only the data that is absolutely necessary for legal, business or regulatory purposes. When you don’t need it anymore, get rid of it. The less you have that is worth stealing, the less of a target you become. There are also a few things you’re not allowed to store at all. These include the full contents of any track from the magnetic stripe (which carries values such as the card verification code and PIN verification value), the three- or four-digit card validation codes, and personal identification numbers.
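The two ideas in the paragraph above, never storing prohibited items and discarding data once it is no longer needed, can be sketched as follows. The field names, the 90-day retention period and the sample records are assumptions for illustration only; the actual retention period should come from your own legal, business and regulatory requirements.

```python
from datetime import date, timedelta

# Hypothetical field names for the items that must never be stored.
PROHIBITED_FIELDS = {"track_data", "card_validation_code", "pin"}


def strip_prohibited(record: dict) -> dict:
    """Return a copy of the record without any prohibited fields."""
    return {k: v for k, v in record.items() if k not in PROHIBITED_FIELDS}


def purge_expired(records: list, today: date, retention_days: int) -> list:
    """Keep only records stored within the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [r for r in records if r["stored_on"] >= cutoff]


raw_record = {
    "cardholder_name": "A. Customer",
    "pan_encrypted": "<ciphertext>",  # placeholder; stored card numbers must be encrypted
    "card_validation_code": "123",
    "pin": "0000",
    "track_data": "<full magnetic stripe contents>",
    "stored_on": date(2024, 6, 1),
}
cleaned = strip_prohibited(raw_record)

old_record = {"cardholder_name": "B. Customer", "stored_on": date(2024, 1, 15)}
# Assumed retention period of 90 days, evaluated as of 30 June 2024.
kept = purge_expired([cleaned, old_record], today=date(2024, 6, 30), retention_days=90)
print(sorted(cleaned))
print(len(kept))
```

In practice the prohibited fields should be dropped before the record ever reaches disk, and the purge would run on a schedule rather than on demand.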
Of course, even if you’ve taken the steps to electronically protect data by encrypting it, there’s still the possibility that someone inside the company could steal or wrongfully employ the encryption keys. For that reason, the third requirement of the PCI DSS also mandates protecting those keys against misuse and disclosure.
Access to these keys must be restricted to the fewest people possible. The keys must also be stored in as few places as possible. Backups are, of course, necessary, but if you end up backing the keys up in too many places, you’re likely to forget where they all are, or accidentally place one where someone with criminal intentions can get hold of it.
Requirements seven, eight and nine also deal with limiting access, including physical access, to cardholder data. They mandate that you restrict access to this data by business need-to-know, and that you assign a unique ID to each person with computer access. These measures help ensure that you can trace the source of a problem should a breach occur.
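A need-to-know check tied to unique user IDs could be sketched roughly as below. The user IDs, role names and audit log structure are hypothetical; the point is only that every access decision is attributed to one identifiable person and recorded so that it can be traced later.

```python
from datetime import datetime, timezone

# Hypothetical mapping of unique user IDs to roles.
USER_ROLES = {"u1001": "billing", "u1002": "marketing"}

# Only roles with a business need to know may see cardholder data.
ROLES_WITH_NEED_TO_KNOW = {"billing"}

audit_log = []


def can_view_cardholder_data(user_id: str) -> bool:
    """Decide access by role and record the attempt against the user's ID."""
    role = USER_ROLES.get(user_id)  # unknown IDs have no role and are denied
    allowed = role in ROLES_WITH_NEED_TO_KNOW
    audit_log.append({
        "user_id": user_id,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed


decisions = [can_view_cardholder_data(uid) for uid in ("u1001", "u1002", "u9999")]
print(decisions)
```

Because shared logins are not used, each entry in `audit_log` points to a single person, which is what makes tracing possible after a breach.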
There is another option for proper data security and storage that simplifies all these security controls. Simply don’t store any data on your own system. Remote storage is becoming a very popular option for merchants who are worried about attacks on their system and possible security breaches.
The only way to ensure that your data security measures are effective is through constant monitoring and management. The unfortunate truth of the matter, though, is that most merchants simply don’t have the time or resources to efficiently and actively control the security on their systems.
But there are companies out there now who specialize in providing effective data security and storage. Remote storage on these systems is one of the best ways to protect sensitive data and take some major steps toward becoming PCI compliant.
Above all, remember that these steps are about more than simple compliance. As consumers grow more wary about who they give their information to, it will become more and more important to guarantee the safety of their personal data.