The data middle platform in the age of big data
Hello, my name is Enhanced. Today I am going to share with you the value of the data middle platform.
So what exactly is a data middle platform? You have probably heard of the concept or thought about it already. To make sure we mean the same thing, let's summarize the core capabilities of a data middle platform.
The origin of the data middle platform
Before we look at the data middle platform, let's look at where the concept of the middle platform (zhongtai) came from.
In mid-2015, Jack Ma led Alibaba's executives on a visit to Supercell, a Finland-based mobile gaming company known for games such as Clash of Clans, Boom Beach, and Hay Day.
To Jack Ma's surprise, the company, which made an annual pre-tax profit of $1.5 billion, had fewer than 200 employees. Its teams are decentralized, with no more than seven people each. Each team decides for itself which product to develop and launches a public beta as quickly as possible. If users do not like the game, the team quickly abandons it and looks for a new direction.
What underpins this efficient, small-unit way of working is the game middle platform that Supercell had built up over six years. By consolidating the common game assets and algorithms from the development process, it lets several small teams each develop a new game in a few weeks and encourages employees to learn by trial and error.
This reusable game middle platform plays a vital role in improving team effectiveness, accumulating the enterprise's content assets, and getting business ideas into production quickly. Six months later, Alibaba followed Supercell's lead and launched its middle platform strategy, which also set off a middle platform boom among Chinese enterprises.
The evolution of the data middle platform at Alibaba
Since it was Alibaba that made the data middle platform strategy popular among Chinese enterprises, how did Alibaba build its own?
It began with Taobao, the C2C business unit. When the B2C business arrived, the Tmall business unit was set up. Because Taobao's technical team had to support both Tmall and Taobao, the group decided to set up a Shared Business Division. As a technical services department, however, it did not carry the same weight as the two business units (after all, whoever brings in the money is the boss), so the Shared Business Division survived squeezed between Taobao and Tmall.
In 2010, Juhuasuan launched successfully and showed a powerful ability to attract traffic. Taobao and Tmall both wanted to connect to Juhuasuan, and later 1688 wanted to join as well. The group decided that every platform connecting to Juhuasuan had to go through the Shared Business Division. This gave the division real leverage, and its previously unequal standing in dialogue with the three e-commerce businesses moved to a relatively fair level.
Alibaba has had different data platforms at different stages of its development, moving from dispersed data to centralized storage, processing, computing, and external services. Alibaba's current data middle platform rests on OneData, OneEntity, and OneService, which are based on the principle of sharing and reuse and which also drive organizational change.
A bottom-up data center
Extraction data center: data statistics with businesses, natural persons, and statistical indicators as entities
Public data center: built around business lines, business processes, and analytical dimensions, holding detail and summary data
Vertical data center: data integration through collection (event tracking, crawling, purchasing, business-generated data), cleaning, synthesizing, synchronization, etc.
The unified data center of Alibaba's data middle platform is divided into three parts: the extraction data center, the public data center, and the vertical data center. The vertical data center collects data from Alibaba's various business units. The public data center resembles a data warehouse and manages all data by subject domain (e-commerce, entertainment, marketing, logistics, finance, and so on). The extraction data center processes the subject-domain data according to business needs and builds five data systems: consumers, enterprises, content, commodities, and location. The purpose of Alibaba's data middle platform is to provide data infrastructure and unified data services internally, and to provide unified data products to external service providers.
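The three-layer flow described above can be sketched in a few lines of Python. This is only an illustrative sketch: the record fields, the sample data, and the function names (collect_vertical, build_public, extract_entities) are assumptions made for this example and do not describe Alibaba's actual implementation.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from separate business units.
RAW_SOURCES = {
    "taobao": [
        {"user": "u1", "amount": "30.0", "domain": "e-commerce"},
        {"user": "u2", "amount": "bad", "domain": "e-commerce"},
    ],
    "logistics": [
        {"user": "u1", "amount": "5.5", "domain": "logistics"},
    ],
}


def collect_vertical(sources):
    """Vertical layer: collect and clean records from every business unit."""
    cleaned = []
    for unit, records in sources.items():
        for rec in records:
            try:
                amount = float(rec["amount"])
            except ValueError:
                continue  # drop records that fail cleaning
            cleaned.append({"unit": unit, "user": rec["user"],
                            "amount": amount, "domain": rec["domain"]})
    return cleaned


def build_public(cleaned):
    """Public layer: organize cleaned records by subject domain."""
    by_domain = defaultdict(list)
    for rec in cleaned:
        by_domain[rec["domain"]].append(rec)
    return dict(by_domain)


def extract_entities(by_domain):
    """Extraction layer: aggregate across domains around one entity, the consumer."""
    consumers = defaultdict(lambda: {"total_amount": 0.0, "domains": set()})
    for domain, records in by_domain.items():
        for rec in records:
            profile = consumers[rec["user"]]
            profile["total_amount"] += rec["amount"]
            profile["domains"].add(domain)
    return dict(consumers)


consumer_view = extract_entities(build_public(collect_vertical(RAW_SOURCES)))
```

Each layer only consumes the output of the layer below it, which is what allows the lower layers to be shared by many business-facing views.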
Summary of the drawbacks of not having a data middle platform
1. Repeated investment and high construction costs
The three e-commerce systems were architecturally completely isolated, and each application was developed and operated independently, so functions were built and maintained repeatedly and the investment was duplicated. From the perspective of both development and operations cost, this is an obvious waste of money and resources for an enterprise.
2. Data islands
The business systems are independent of each other and hard to connect. As the business grows, an enterprise needs to connect these siloed "chimney" systems to improve operational efficiency. After Tmall appeared in 2008, many brands built a system integration with Tmall, then further integrations with JD.com and WeChat commerce. At the same time they had thousands of stores and distributors to manage, so they also had to set up the corresponding POS and CRM systems. In 2013, e-commerce hit the distribution model of traditional retailers hard. Retailers were anxious to obtain end consumers' behavior and preference information to support precision marketing, but found that membership, product, and order information was scattered across the siloed systems. They had to break these silos open to obtain the global membership and consumption data the brand needed, which cost a great deal of manpower and material resources.
3. Assets cannot be accumulated
The traditional habit of reinventing the wheel is not conducive to accumulating data assets.
(1) If every campaign requires developing a new set of pages and systems from scratch, and the business runs campaigns often, development costs are undoubtedly very high. In this case we can build a campaign template to reduce development costs, so that the wheel can be reused.
(2) From the data perspective, indicators, profiles, algorithms, knowledge graphs, and other models are all data models built by processing historical data. They are the product of a great deal of manpower, technical iteration, tuning, and polishing. In the big data era these models are intangible core assets of very high value to an enterprise, and they should remain reusable when the underlying system or database is switched. This is also an effective way, in the big data era, to gain a decisive advantage over competitors.
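The data-island problem in point 2 above can be pictured with a small sketch. Here three siloed systems each hold part of the picture, and a single shared function builds the global member view once so that it can be reused. The source names, the fields, and the use of a phone number as the join key are all assumptions for illustration.

```python
# Hypothetical exports from three siloed systems, keyed by phone number.
TMALL_ORDERS = [{"phone": "13800000001", "order_total": 120.0}]
JD_ORDERS = [
    {"phone": "13800000001", "order_total": 80.0},
    {"phone": "13800000002", "order_total": 45.0},
]
CRM_MEMBERS = [
    {"phone": "13800000001", "name": "Member A"},
    {"phone": "13800000002", "name": "Member B"},
]


def build_global_view(members, *order_sources):
    """Merge member records with orders from any number of channels."""
    view = {
        m["phone"]: {"name": m["name"], "total_spend": 0.0, "orders": 0}
        for m in members
    }
    for source in order_sources:
        for order in source:
            member = view.get(order["phone"])
            if member is None:
                continue  # order from someone who is not a known member
            member["total_spend"] += order["order_total"]
            member["orders"] += 1
    return view


global_view = build_global_view(CRM_MEMBERS, TMALL_ORDERS, JD_ORDERS)
```

Adding a new sales channel then means passing one more order list to the same function instead of building another point-to-point integration.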
The concept of the data asset middle platform
From the cases above we can see the importance of the data middle platform. So how do we define it?
(1) At the technology platform level: Enterprise A has built a big data platform. The big data computing layer speeds up data processing, so the required data models may be obtained quickly without data layering. Is this a middle platform?
If departments X, Y, and Z all have such data computing scenarios, building one reusable big data platform rather than separate ones not only reduces labor costs but is also more efficient. This is middle platform thinking.
(2) At the tool level: Enterprise B needs to produce many reports. Custom development has long cycles, is inefficient, and slows decisions, so the enterprise buys a BI tool and delivery efficiency increases tenfold. Is this a data middle platform?
We see many companies selling data-related tools, such as data governance, data processing, and BI analysis products, that help customers quickly deliver reporting and dashboard capabilities. This is also middle platform thinking.
(3) At the data service level: Enterprise D is a financial company. In financial business, risk control capability is closely tied to the default rate of the credit business. To improve risk control, the company needs multi-dimensional risk control data services, for example: external master data services (such as enterprise lookup, enterprise profile, and enterprise credit services), user behavior analysis (building user profiles), profile-based marketing services, relationship graph services (anti-fraud), telecom operator blacklist number services, default prediction model services, and so on. Profile tag services, for instance, have to be accumulated continuously according to application scenarios. Big data companies such as TalkingData, Aurora (Jiguang), and Getui have each accumulated hundreds of marketing tags on the user entity. That is the result of many years of accumulation and cannot be built overnight. In later sessions I will share how these assets are put to work in various scenarios.
In a word: the data asset middle platform is called the data middle platform for short. Its data assets span the technology platform layer, the tool layer, and scenario-driven data services. The data middle platform is a solution to the problem of how quickly data can travel from the raw data layer to application in the business layer.
The value drivers of the data middle platform
At the data service layer we mentioned various types of data services. In today's big data era, how do we build these different types of data service capabilities? First, let's look at how thinking about data consumption has changed in the age of big data.
From sample thinking to whole-population thinking
From precise thinking to fault-tolerant thinking
From causal thinking to correlation thinking
Let's explain from the technical perspective, the data perspective and the application perspective.
Technical perspective: data characteristics and data consumption thinking
In the era of big data, multi-source heterogeneous data plays an increasingly important role in different scenarios: structured data from business systems, semi-structured data gathered by crawlers or from user behavior, and unstructured data such as audio and video. Let's take a look at the characteristics of data in the age of big data:
1. Massive Scale (Volume)
Data has grown from small to big, from GB to TB to PB and EB. Building a system that speeds up data consumption is the precondition for using business data well, and it is what keeps data-empowered intelligent scenarios alive. The construction of a data middle platform in the big data era therefore no longer follows the traditional data warehouse model.
2. Diversity (Variety)
Data sources, formats, and structures are highly varied: business data, tags, logs, images, voice, video, and so on. Different data plays an increasingly important role in different scenarios, and algorithm models need large amounts of historical data for training, so the accumulation of data in different formats and of algorithms as assets has become part of the middle platform's asset capability. As a result, the data asset layer no longer holds only data from traditional business systems. This has contributed to the emergence of data lakes.
3. High Speed (Velocity)
New data is analyzed and applied to the business in near real time, so unified batch and stream processing becomes a problem that the technology layer of the data middle platform must solve.
4. Low Value Density (Value)
Value density is inversely related to the total volume of data. For example, in surveillance video used to solve a case, we only care about the moments when the suspect appears. How to quickly "purify" the valuable data is a problem that needs solving in the current big data context. Algorithm models trained with neural networks are gradually addressing this problem.
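The point about unified batch and stream processing under Velocity above can be illustrated with a minimal sketch. The idea shown is that the business logic is written once and then applied both to a finished batch and to events arriving one at a time. The event fields and the large-order threshold of 100 are assumptions for illustration, and a real system would rely on a dedicated processing engine rather than plain Python.

```python
from typing import Iterable, Iterator

# Hypothetical order events.
EVENTS = [
    {"order_id": 1, "amount": 150.0},
    {"order_id": 2, "amount": 20.0},
]


def enrich(event: dict) -> dict:
    """Business logic written once: flag large orders."""
    return {**event, "is_large": event["amount"] >= 100}


def run_batch(events: list) -> list:
    """Batch mode: process a complete, bounded data set in one pass."""
    return [enrich(e) for e in events]


def run_stream(events: Iterable[dict]) -> Iterator[dict]:
    """Stream mode: process events lazily, one at a time, as they arrive."""
    for e in events:
        yield enrich(e)


batch_result = run_batch(EVENTS)
stream_result = list(run_stream(iter(EVENTS)))
```

Because both modes call the same enrich function, the batch and streaming results cannot drift apart, which is the main benefit of unifying the two.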
Data perspective: from business data to data assetization
Application data in the information age: electronic records of business activity stored in relational databases, such as ERP, CRM, and SCM systems. The data is isolated.
Application data in the age of big data: data assets built from different data models.
Data assetization: data empowers the business, and intelligent data applications optimize business scenarios, for example analytical databases, business intelligence, precision advertising, search, recommendation, risk control, and perceptual and cognitive computing.
With intelligent scenarios, today's business systems are being rebuilt. In particular, the enterprise Internet, the Internet of Things, and the industrial Internet link upstream and downstream so that more data can be obtained. To give an example: ERP used to focus on management, but now ERP can link to external resources. Previously the production system (ERP), supply chain (SCM), and marketing system (CRM) were separate; now the data is integrated. ERP can receive orders from JD.com, Taobao, and so on, set production according to sales, use sales data to forecast production, and use production and sales data to optimize supply chain management. Data is connected and no longer isolated.
Application perspective: from auxiliary decision-making to data-driven
Data application mode without a data middle platform: data-assisted decision making
Without a data middle platform, some people drive the analysis and others apply the results. Data mainly supports business decisions, typically in the form of business intelligence, visual reports, dashboards, large-screen displays, executive cockpits, and so on.
Data application mode with a data middle platform: data-driven application development
Create value by supporting intelligent business applications through online data services or model services, or even by providing end-user data products directly.
This enables innovative products, for example: micro-credit that uses data to automatically assess the creditworthiness of individuals or enterprises and decide whether to lend; intelligent investment advice that uses knowledge graphs to make automated recommendations; and intelligent marketing that automates precise matching. Intelligent business scenarios require large amounts of historical, multi-source heterogeneous data to be processed into different types of data models that form data assets. Later I will tell you about how the different models are built.
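To make the micro-credit example concrete, here is a minimal sketch of a data-driven lending decision that combines two data services: a blacklist lookup and a default-prediction score. The Applicant fields, the decide_loan function, and the thresholds are all hypothetical values chosen for illustration, not a real credit policy.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    on_blacklist: bool           # result of a hypothetical blacklist service
    default_probability: float   # output of a hypothetical default-prediction model
    requested_amount: float


def decide_loan(applicant: Applicant,
                max_default_probability: float = 0.05,
                max_amount: float = 50_000) -> str:
    """Return 'approve', 'reject', or 'manual_review' without human input."""
    if applicant.on_blacklist:
        return "reject"
    if applicant.default_probability > max_default_probability:
        return "reject"
    if applicant.requested_amount > max_amount:
        return "manual_review"  # large loans still go to a person
    return "approve"


decision = decide_loan(Applicant(on_blacklist=False,
                                 default_probability=0.02,
                                 requested_amount=10_000))
```

The application itself contains almost no data logic; it simply consumes services that the data middle platform already provides, which is the difference between data-driven and data-assisted use.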
Summary of the value of the data middle platform
Drawing on the value drivers above, we use examples to summarize the characteristics of the assets accumulated in a data middle platform:
Innovation
Many Internet companies use event tracking to collect user behavior data, and use the tracked data for user growth, product iteration, operations strategy, and precision marketing. Let's look at the overall scenario for applying user behavior data.
From a technical point of view, the user behavior logs we collect usually contain a unique identifier for the PC or mobile device, the time the data was generated, the location, and similar information. The business cannot use this data directly. It has to be organized into data assets that business staff can understand, or into data services that can be called directly. So what is understandable data? Here is what it looks like:
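As a rough illustration, the sketch below turns raw tracking events into tags that a business user could read directly. The event fields, the tag names, and the activity threshold of three events are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical raw tracking events: device identifier, time, browsed category.
RAW_EVENTS = [
    {"device_id": "d1", "time": "2020-01-01T09:00:00", "category": "shoes"},
    {"device_id": "d1", "time": "2020-01-01T09:05:00", "category": "shoes"},
    {"device_id": "d1", "time": "2020-01-02T21:00:00", "category": "books"},
    {"device_id": "d2", "time": "2020-01-03T10:00:00", "category": "books"},
]


def build_tags(events, active_threshold=3):
    """Summarize raw events into business-readable tags per device."""
    per_device = defaultdict(list)
    for event in events:
        per_device[event["device_id"]].append(event)

    tags = {}
    for device, device_events in per_device.items():
        categories = Counter(e["category"] for e in device_events)
        favorite, _ = categories.most_common(1)[0]
        tags[device] = {
            "favorite_category": favorite,
            "activity_level": ("active" if len(device_events) >= active_threshold
                               else "casual"),
        }
    return tags


tags = build_tags(RAW_EVENTS)
```

A marketer can act on "active user who prefers shoes" immediately, whereas the raw rows of device identifiers and timestamps would first need an analyst to interpret them.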
Scenario-driven
A demand-driven approach has a certain lag and leaves the information department passively receiving requests.
A scenario-driven approach turns passive into active: when problems occur, the team responds quickly based on the accumulated data asset and service systems.
Being scenario-driven is also a best practice for quickly validating middle platform capabilities when rolling out a data middle platform.
Experience accumulation
By understanding the scenario, the business side quickly selects the data capabilities it needs to verify the value of the data. The validation process and its results are an important part of the accumulated data assets; they are fed back to the data middle platform and become a key capability for continuous optimization.
Cost reduction and efficiency gains
1. Efficiency issues: the continuous accumulation of assets drives business growth and lets new businesses experiment quickly through trial and error to settle on a direction. Business operations face many uncertainties: macro-environmental factors such as policy adjustments, market competition, demand changes, and technological evolution; micro-environmental factors in value chain links such as R&D and design, manufacturing, product services, and industrial chain collaboration; and stakeholder factors involving multiple parties such as employees, partners, competitors, and users. These uncertainties are intertwined, so managers face unprecedented tests: diversified market demand, value-added products and services, complex production processes, and multi-dimensional industrial collaboration.
2. Collaboration issues: unified data standards improve the efficiency of integration with the business or front end and avoid duplicated development.
3. Capability issues: data maintenance and data development become simpler, more efficient, safer, and more stable, which reduces costs.
4. Be proactive: the world has entered the data era, and data is the core competitiveness of an enterprise. Whether it is a data platform or a data middle platform, what lies behind rapid decision-making, business empowerment, and business growth is the use of data's value.
Your questions will be answered
After reading this article you may have many questions. The data middle platform is so large; can only big enterprises build one? What does a construction plan look like? How does it differ from a data warehouse? How should our current business adopt it? What do we need in order to adopt it? I will answer these one by one in later sessions.