What is Analytics?
Data analytics is the process of examining large and varied data sets to uncover hidden patterns, correlations, and other insights that can be used to inform business decisions. It involves the use of statistical and computational methods, as well as data visualization tools, to make sense of complex data.
The need for Data Lakes and Data Connectors
Today, large organizations are saddled with multiple operational ERP systems, accumulated through acquisitions, mergers, or other means. To get an overall picture of organizational performance and to forecast future performance, a consolidated data lake has to be built, and dataZense, with connectors/adapters available for multiple ERP systems, becomes all the more relevant. With data traffic coming from multiple systems, traffic cops become essential: the Analytical MDM feature of dataZense governs the data coming into the lake. The choice of data lake platforms has also expanded (Microsoft Synapse, Databricks, Delta Lake, Cloudera, Amazon, Snowflake, MongoDB, OpenTSDB, Google BigQuery, and others). Data is pumped into these lakes (data ingestion) through specialized receiving ports, and dataZense has ready-to-use ingestion adapters for this.
Improving Data Quality in the Lakes
Within the lake, all the data brought in from various sources is initially kept in a Bronze layer. The data from Bronze is cleaned and transformed in line with the needs of the business and moved into the Silver layer. From Silver, analytics-ready data is formed, typically as data cubes, in the Gold layer. Data de-duping, standardization, and some enrichment also happen between Silver and Gold.
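The Bronze, Silver, and Gold flow described above can be sketched in a few lines of Python. This is a minimal illustration, not how dataZense implements it: the record fields (`source`, `customer`, `amount`) and the functions `to_silver` and `to_gold` are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical raw records as they might land in the Bronze layer.
bronze = [
    {"source": "erp_a", "customer": " Acme Corp ", "amount": "100.50"},
    {"source": "erp_b", "customer": "ACME CORP", "amount": "200"},
    {"source": "erp_b", "customer": "ACME CORP", "amount": "200"},  # duplicate
    {"source": "erp_a", "customer": "Globex", "amount": "n/a"},     # unparseable
]


def to_silver(rows):
    """Clean and standardize Bronze rows; skip rows whose amount cannot be parsed."""
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        silver.append({
            "source": row["source"],
            "customer": row["customer"].strip().upper(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """De-duplicate Silver rows, then total the amount per customer."""
    seen = set()
    totals = defaultdict(float)
    for row in rows:
        key = (row["source"], row["customer"], row["amount"])
        if key in seen:
            continue  # exact duplicate from the same source
        seen.add(key)
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Here the Gold result is a simple per-customer total; a real cube would carry more dimensions (period, region, product), but the layering idea is the same.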
dataZense provides the dashboards most commonly required by CEOs, CFOs, COOs, CISOs, Shop Floor Planners, and others. Several algorithms are available in dataZense to analyse the data. The crux is to find the right analysis method for the objective for which we set out to collect the data. Sometimes we might fit a line or curve that best describes the spread of the data collected. Other times we may resort to “Machine Learning” techniques or algorithms to answer forward-looking questions or to group (classify, cluster) the data. The dashboards and reports can be distributed within the organization.
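As an illustration of the best-fit line mentioned above, here is a minimal ordinary least squares sketch in plain Python. The function `fit_line` and the sample figures are made up for the example; they do not come from dataZense.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly revenue figures (month number, revenue).
months = [1, 2, 3, 4, 5]
revenue = [12.0, 14.0, 16.0, 18.0, 20.0]

slope, intercept = fit_line(months, revenue)
forecast_month_6 = slope * 6 + intercept  # extrapolate one month ahead
```

A straight line is only appropriate when the data is roughly linear; for anything else, a curve fit or a machine learning model may describe the data better.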
Access to the dashboards, reports, and the Gold layer in general has to be restricted based on the role of the business user and the user's geographical location within the organization. These access controls are managed from within dataZense. It can integrate the analytics system with popular identity management systems such as Oracle Identity Manager (OIM) and Okta, which can then provision access information to the analytics system.
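A role-and-location check of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the `BusinessUser` class, the policy tables, and the idea that certain roles see all regions. In practice the role and region would be provisioned by the identity management system rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessUser:
    name: str
    role: str
    region: str


# Hypothetical policy: roles allowed to open each dashboard.
DASHBOARD_ROLES = {
    "finance_summary": {"CEO", "CFO"},
    "shop_floor_plan": {"COO", "Shop Floor Planner"},
}

# Hypothetical roles that may see data for every region.
GLOBAL_ROLES = {"CEO", "CFO"}


def can_open(user, dashboard):
    """True if the user's role is allowed to open the dashboard."""
    return user.role in DASHBOARD_ROLES.get(dashboard, set())


def visible_rows(user, rows):
    """Return all rows for global roles, otherwise only the user's own region."""
    if user.role in GLOBAL_ROLES:
        return list(rows)
    return [row for row in rows if row["region"] == user.region]


planner = BusinessUser("Priya", "Shop Floor Planner", "EMEA")
cfo = BusinessUser("Sam", "CFO", "AMER")
gold_rows = [
    {"region": "EMEA", "output": 120},
    {"region": "AMER", "output": 95},
]
```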
Gold layer data can be scrambled or masked, depending on how critical or sensitive a given column is.
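Column-level masking can be sketched as a per-column rule table. The rule names (`mask`, `hash`), the column names, and the helper functions below are hypothetical and only illustrate the idea.

```python
import hashlib

# Hypothetical rules: which protection to apply to which column.
SENSITIVE_COLUMNS = {"ssn": "mask", "email": "hash"}


def mask_value(value, keep_last=4):
    """Replace all but the last few characters with asterisks."""
    text = str(value)
    if len(text) <= keep_last:
        return "*" * len(text)
    return "*" * (len(text) - keep_last) + text[-keep_last:]


def hash_value(value):
    """Replace the value with a short, repeatable SHA-256 digest prefix."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]


def protect(row, rules):
    """Return a copy of the row with sensitive columns masked or hashed."""
    protected = {}
    for column, value in row.items():
        rule = rules.get(column)
        if rule == "mask":
            protected[column] = mask_value(value)
        elif rule == "hash":
            protected[column] = hash_value(value)
        else:
            protected[column] = value
    return protected


record = {"name": "Ann", "ssn": "123-45-6789", "email": "ann@example.com"}
protected = protect(record, SENSITIVE_COLUMNS)
```

Note that an unsalted hash keeps values joinable across tables but can be reversed by guessing for low-variety data, so it is a scrambling technique rather than strong anonymization.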
In the fast-paced world of Information Technology, as soon as one thing is done the next is ready for consideration. People today want to rein in unstructured data (e-mail, PDF, Excel, Word, social media, etc.). Are you wondering how Excel could be unstructured? It is, if people put multiple tables of data in the same sheet, side by side and one below the other. dataZense crawlers extract tabular data from Word, PDF, Excel, and e-mail and bring it over to the data lake. OCR technology is put to use where needed.
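To make the "multiple tables in one sheet" problem concrete, the sketch below splits a sheet, represented as a list of rows, into separate tables wherever a fully blank row or column separates them. This is a simple heuristic written for illustration, not the dataZense crawler logic; `find_tables` and its helpers are hypothetical names, and tables that touch each other without a blank gap would not be separated.

```python
def is_blank(cell):
    """Treat None and whitespace-only strings as empty cells."""
    return cell is None or str(cell).strip() == ""


def split_runs(flags):
    """Return (start, end) index pairs for each consecutive run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(flags)))
    return runs


def find_tables(grid):
    """Split a sheet into tables separated by blank rows and blank columns."""
    if not grid:
        return []
    width = max(len(row) for row in grid)
    padded = [list(row) + [None] * (width - len(row)) for row in grid]
    tables = []
    row_has_data = [any(not is_blank(c) for c in row) for row in padded]
    for r0, r1 in split_runs(row_has_data):
        band = padded[r0:r1]
        col_has_data = [
            any(not is_blank(row[c]) for row in band) for c in range(width)
        ]
        for c0, c1 in split_runs(col_has_data):
            tables.append([row[c0:c1] for row in band])
    return tables


# Hypothetical sheet: two tables side by side, and a third one below them.
sheet = [
    ["Region", "Sales", None, "Item", "Qty"],
    ["East", 10, None, "Bolt", 5],
    ["West", 20, None, "Nut", 7],
    [None, None, None, None, None],
    ["Year", "Headcount"],
    [2023, 40],
]
tables = find_tables(sheet)
```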
So far it has been data, data, and data. What about metadata (the definition, meaning, and interpretation details of data)? We lose metadata when systems get old and the employees who know them retire (loss of tribal knowledge). We also lose metadata when a company and its systems are acquired rapidly and no proper transfer of information happens (or when many of the acquired entity's employees are let go).
Data Cataloging is an effort in which an organization meticulously collects and documents the usable data stores lying in various legacy and other systems and fills in missing metadata. Done manually, it is a very tedious and time-consuming process. dataZense offers crawlers that examine the data in each column, determine its nature, and document the findings in a Catalog. A built-in workflow allows many data experts to collaborate and provide or refine metadata. The Catalog is a living reference document, which means it must be updated constantly by the parties responsible for it. Business analysts and data scientists can use the Catalog to find out what data is available across the organization and use that data in analytics and visualization.
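The kind of column examination a catalog crawler performs can be sketched as a small profiling function. The type rules and the names `infer_type` and `profile_column` are assumptions for this example; a real profiler would recognize many more formats.

```python
import re
from datetime import datetime


def infer_type(value):
    """Classify a string value as integer, decimal, date (YYYY-MM-DD), or text."""
    text = value.strip()
    if re.fullmatch(r"[+-]?\d+", text):
        return "integer"
    if re.fullmatch(r"[+-]?\d+\.\d+", text):
        return "decimal"
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def profile_column(name, values):
    """Summarize a column: row count, nulls, distinct values, inferred type."""
    non_null = [v for v in values if v is not None and v.strip() != ""]
    types = {infer_type(v) for v in non_null}
    if not types:
        inferred = "unknown"
    elif len(types) == 1:
        inferred = next(iter(types))
    else:
        inferred = "mixed"
    return {
        "column": name,
        "rows": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "inferred_type": inferred,
    }


# Hypothetical column values as extracted strings.
hire_date_profile = profile_column(
    "hire_date", ["2020-01-15", "2021-03-01", "", None, "2020-01-15"]
)
emp_id_profile = profile_column("emp_id", ["1", "2", "3"])
```

A profile like this gives the data experts a starting point: they confirm or correct the inferred type and add the business meaning that cannot be derived from the values alone.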
The data catalog also profiles the data and infers possible relationships among the columns considered for metadata enhancement. This allows us to create a first-cut Entity Relationship (ER) diagram.
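One common way to propose such relationships is to look for columns whose values are all contained in a unique column of another table (a candidate foreign key). The sketch below shows that heuristic under the assumption that tables are available as in-memory column lists; it is not a description of the dataZense algorithm, and `candidate_foreign_keys` is a hypothetical name.

```python
def candidate_foreign_keys(tables):
    """Suggest (child_column, parent_column) pairs from value containment.

    tables maps table name -> {column name: list of values}.
    """
    # Columns with no repeated values are candidate keys.
    unique_columns = {}
    for table, columns in tables.items():
        for column, values in columns.items():
            if values and len(set(values)) == len(values):
                unique_columns[(table, column)] = set(values)

    links = []
    for table, columns in tables.items():
        for column, values in columns.items():
            child_values = set(values)
            if not child_values:
                continue
            for (key_table, key_column), key_values in unique_columns.items():
                # A column may reference a key in a different table
                # only if every one of its values appears in that key.
                if key_table != table and child_values <= key_values:
                    links.append(
                        (f"{table}.{column}", f"{key_table}.{key_column}")
                    )
    return sorted(links)


# Hypothetical sample tables.
sample_tables = {
    "customers": {
        "customer_id": [1, 2, 3],
        "name": ["Ann", "Bob", "Cy"],
    },
    "orders": {
        "order_id": [10, 11, 12, 13],
        "customer_id": [1, 1, 3, 2],
    },
}
links = candidate_foreign_keys(sample_tables)
```

Value containment alone produces false positives, for example when two unrelated columns both hold small integers, which is why the result is only a first cut for experts to review.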
The Data Lineage function of dataZense allows you to track the origin and movement of data throughout the organization. You can see where your data came from, how it has been transformed, and where it has been used. This is especially important for compliance and regulatory purposes.
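Lineage is naturally modeled as a directed graph in which each edge records a source, a target, and the transformation applied. The sketch below is a generic illustration with hypothetical dataset names and a hypothetical `LineageGraph` class; it does not reflect the dataZense data model.

```python
from collections import defaultdict


class LineageGraph:
    """Records which datasets feed which, and with what transformation."""

    def __init__(self):
        self._parents = defaultdict(list)

    def record(self, source, target, transformation):
        """Note that target was produced from source by transformation."""
        self._parents[target].append((source, transformation))

    def direct_sources(self, dataset):
        """Return the (source, transformation) pairs feeding a dataset."""
        return list(self._parents.get(dataset, []))

    def upstream(self, dataset):
        """Return every dataset that directly or indirectly feeds dataset."""
        seen, order, stack = set(), [], [dataset]
        while stack:
            node = stack.pop()
            for source, _ in self._parents.get(node, []):
                if source not in seen:
                    seen.add(source)
                    order.append(source)
                    stack.append(source)
        return order


lineage = LineageGraph()
lineage.record("erp_a.invoices", "bronze.invoices", "ingest")
lineage.record("bronze.invoices", "silver.invoices", "clean")
lineage.record("silver.invoices", "gold.revenue_cube", "aggregate")
lineage.record("silver.customers", "gold.revenue_cube", "join")

origins = lineage.upstream("gold.revenue_cube")
```

Answering "where has this data been used" is the same traversal in the opposite direction, which would need a second mapping from each source to its targets.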
- Ready connectors/extract adapters for ERPs such as Oracle E-Business Suite, SAP ECC, SAP S/4HANA, Oracle Cloud Applications, Microsoft Dynamics, JD Edwards, PeopleSoft, and others.
- Pre-defined cubes (financial, manufacturing, HR, and other areas) that feed the dashboards and reports company executives commonly need to make meaningful business decisions.
- Economical Enterprise Data Management (EDM)
- Build and deploy data lakes on platforms of your choice: Microsoft, Amazon, Oracle, Google, Snowflake, Databricks, in-house servers, and others
- Data Quality measures