Spark Content

  1. Data sources (mobile apps, websites, web apps, microservices, IoT devices, etc.) are instrumented to collect relevant data.
  2. The Data Lake contains all data in its natural/raw form as it was received, usually as blobs or files. The Data Warehouse stores cleaned and transformed data along with its catalog and schema.
  3. Intelligent metadata catalogs. Data-heavy streaming or live data analysis streaming will gain considerable momentum in 2020.
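
The split between raw data in the lake and cleaned, schema-bound data in the warehouse can be illustrated with a minimal sketch. It uses plain Python rather than Spark so it runs without a cluster. The event fields (`device_id`, `ts`, `temp_c`), the `Reading` schema, and the helper names are all hypothetical and are not taken from any of the linked articles.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List, Optional

# Hypothetical raw events as they might land in a data lake:
# one blob per line, stored exactly as received, including bad records.
RAW_EVENTS = [
    '{"device_id": "sensor-1", "ts": "2020-01-01T00:00:00Z", "temp_c": "21.5"}',
    '{"device_id": "sensor-2", "ts": "2020-01-01T00:00:05Z", "temp_c": "bad"}',
    'not json at all',
    '{"device_id": " sensor-1 ", "ts": "2020-01-01T00:01:00Z", "temp_c": 22}',
]


@dataclass(frozen=True)
class Reading:
    """Assumed warehouse schema: typed, cleaned fields."""
    device_id: str
    ts: datetime
    temp_c: float


def parse_event(line: str) -> Optional[Reading]:
    """Clean one raw line; return None if it does not fit the schema."""
    try:
        raw = json.loads(line)
        return Reading(
            device_id=str(raw["device_id"]).strip(),
            ts=datetime.strptime(raw["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(
                tzinfo=timezone.utc
            ),
            temp_c=float(raw["temp_c"]),
        )
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing keys, or non-numeric values are rejected.
        return None


def to_warehouse(lines: Iterable[str]) -> List[Reading]:
    """Keep only the records that pass cleaning; the raw lines stay untouched."""
    return [r for r in map(parse_event, lines) if r is not None]


CLEANED = to_warehouse(RAW_EVENTS)
```

The raw list is never modified, which mirrors the idea that the lake keeps everything as received while the warehouse holds only records that conform to the schema.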

Blogs

  1. Predictions 2020: Ushering in 2020 Data Predictions
  2. Architecture for High-Throughput Low-Latency Big Data Pipeline on Cloud
  3. The art of joining in Spark
  4. Spark study notes: core concepts visualized 2018

Spark & Kubernetes

  1. How to build Spark from source and deploy it to a Kubernetes cluster in 60 minutes
  2. Big data interview questions
