Spark Content
- Data sources (mobile apps, websites, web apps, microservices, IoT devices, etc.) are instrumented to collect relevant data.
- The Data Lake contains all data in its natural/raw form, as it was received, usually as blobs or files. The Data Warehouse stores cleaned and transformed data along with its catalog and schema.
- Intelligent metadata catalogs.
- Data-heavy streaming, or streaming analysis of live data, will gain considerable momentum in 2020.
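
The split between raw data in the lake and cleaned, schema-bound data in the warehouse can be sketched in a few lines. This is only an illustration in plain Python, not Spark code: the sample events, the `Reading` schema, and the helper names `parse_line` and `load_warehouse` are all hypothetical and are not taken from any of the articles listed here.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Hypothetical raw events, stored exactly as received (one blob per line),
# including malformed and dirty records. This stands in for the data lake.
RAW_LAKE_LINES = [
    '{"device": "sensor-1", "temp_c": "21.5", "ts": "2020-01-01T00:00:00"}',
    '{"device": "sensor-2", "temp_c": "n/a", "ts": "2020-01-01T00:00:05"}',
    'not valid json',
    '{"device": " Sensor-1 ", "temp_c": "22.0", "ts": "2020-01-01T00:01:00"}',
]


@dataclass(frozen=True)
class Reading:
    """Hypothetical warehouse schema: typed, cleaned columns."""
    device: str
    temp_c: float
    ts: str


def parse_line(line: str) -> Optional[Reading]:
    """Clean one raw record; return None if it cannot fit the schema."""
    try:
        rec = json.loads(line)
        return Reading(
            device=rec["device"].strip().lower(),  # normalize identifiers
            temp_c=float(rec["temp_c"]),           # enforce a numeric type
            ts=rec["ts"],
        )
    except (ValueError, KeyError, TypeError, AttributeError):
        # Unparseable JSON, missing fields, or non-numeric values are dropped.
        return None


def load_warehouse(lines: Iterable[str]) -> List[Reading]:
    """Keep only the records that were successfully cleaned and typed."""
    return [r for r in map(parse_line, lines) if r is not None]


warehouse_rows = load_warehouse(RAW_LAKE_LINES)
for row in warehouse_rows:
    print(row)
```

The raw lines are never modified; only the cleaned rows carry a schema. A real pipeline would typically do the same step with a distributed engine and would route rejected records somewhere for inspection rather than discarding them.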
Blogs
- Predictions 2020: Ushering in 2020 Data Predictions
- Architecture for High-Throughput Low-Latency Big Data Pipeline on Cloud
- The art of joining in Spark
- Spark study notes: core concepts visualized 2018