analytics-zoo #941

Open
junxnone opened this issue Mar 5, 2021 · 0 comments

Reference

Brief

  • Distributed TensorFlow/Keras/PyTorch training and inference on Apache Spark
  • Scale AI models out to large clusters
  • Deploy AI pipelines to clusters
  • AutoML (with time-series support)

arch

Orca

  • Scales a single-node Python AI pipeline out across large clusters
  • Supported frameworks - scikit-learn/TensorFlow/PyTorch/Keras/MXNet/Horovod
  • Supported inputs - Pandas/NumPy/PIL/TensorFlow Dataset/PyTorch DataLoader
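
To make "scaling out a single-node pipeline" concrete, here is a minimal sketch of the data-parallel idea, using only the Python standard library. It is not the Orca API and is not taken from analytics-zoo: the function names (`shard`, `local_gradient`, `distributed_step`, `train`) are hypothetical, threads stand in for cluster workers, and a synchronous gradient-averaging scheme is assumed purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def shard(data, num_workers):
    """Split data into num_workers partitions (round-robin)."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w, partition):
    """Summed squared-error gradient for the model y = w * x on one partition."""
    grad = sum(2 * (w * x - y) * x for x, y in partition)
    return grad, len(partition)


def distributed_step(w, partitions, lr):
    """One synchronous step: workers compute local gradients, then average."""
    # Threads stand in for cluster workers (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(pool.map(lambda p: local_gradient(w, p), partitions))
    total_grad = sum(g for g, _ in results)
    total_n = sum(n for _, n in results)
    return w - lr * total_grad / total_n


def train(data, num_workers=4, lr=0.01, epochs=200):
    """Fit w by gradient descent over data sharded across num_workers."""
    partitions = shard(data, num_workers)
    w = 0.0
    for _ in range(epochs):
        w = distributed_step(w, partitions, lr)
    return w


# Toy dataset generated from y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data)
```

Because the averaged gradient over all partitions equals the full-batch gradient, `train(data, num_workers=1)` and `train(data, num_workers=4)` converge to the same weight. That is the sense in which the single-node code is unchanged and only the execution is spread out.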

@20210322

  • tf1.x: poorly supported; no callback/TensorBoard support, and loss/metrics support is weak
  • tf2.x: supports callback/TensorBoard; porting is basically painless

Term

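  • BigDL: a distributed deep learning library; applications can use it directly on Apache Spark
  • Apache Spark: an open-source cluster computing framework (Spark SQL/MLlib/GraphX/Structured Streaming)
    • Spark ML pipeline
  • Apache Flink: an open-source stream processing framework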
  • Ray
    • RayOnSpark
  • Apache Hadoop YARN (Yet Another Resource Negotiator): a resource management system
    • Apache Hadoop: a foundational framework for distributed systems
  • K8s (Kubernetes): a container orchestration engine
  • Databricks
  • Google Dataproc
  • Cluster Serving
  • Scala
  • TFPark