SnailDove's blog

Brother Snail's Blog



HyperLogLog Estimation Algorithm Simulation

Posted on 2019-11-14 | In English
Words count in article 1.3k | Reading time ≈ 6
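Notes on the source-code simulation experiment: the constant bias-correction factor from the paper "HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm" is left out so that the simulation stays as simple as possible. The results show that the number of buckets (the size of the register array) affects the accuracy of the HyperLogLog algorithm, which in turn explains why the bias-correction constant has to be adjusted according to the number of buckets. ...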
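The excerpt above is cut off before the post's own code, so the following is only a minimal sketch of the kind of experiment it describes, not the author's implementation. It assumes a 64-bit hash taken from SHA-1, `2**p` buckets indexed by the top `p` bits, and the raw harmonic-mean estimate with no correction constant. The names `rho` and `raw_hll_estimate` are hypothetical.

```python
import hashlib


def rho(w, bits):
    """Position of the leftmost 1-bit in a `bits`-wide word (bits + 1 if w is 0)."""
    return bits - w.bit_length() + 1


def raw_hll_estimate(items, p):
    """Raw HyperLogLog estimate with 2**p buckets and NO bias-correction constant."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        # Assumption: use the first 8 bytes of SHA-1 as a 64-bit hash value.
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - p)                # top p bits select the bucket
        w = h & ((1 << (64 - p)) - 1)      # remaining bits feed rho
        registers[idx] = max(registers[idx], rho(w, 64 - p))
    # Harmonic-mean estimate, deliberately without the alpha_m factor.
    return m * m / sum(2.0 ** -r for r in registers)


if __name__ == "__main__":
    n = 20_000  # true number of distinct items
    for p in (4, 8, 12):
        est = raw_hll_estimate(range(n), p)
        print(f"buckets={1 << p:5d}  raw estimate / true = {est / n:.3f}")
```

Without the constant, the raw estimate tends to overshoot the true cardinality by a factor of roughly 1/α_m, and the spread of the ratio shrinks as the bucket count grows. In the paper α_m depends on the bucket count m (for example about 0.673 for m = 16 and 0.7213 / (1 + 1.079 / m) for m ≥ 128), which is why the correction has to follow the bucket count.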

Translation: Chapter 12 Resilient Distributed Datasets (RDDs)

Posted on 2019-11-07 | In English
Words count in article 11.6k | Reading time ≈ 55
Chapter 12. Resilient Distributed Datasets (RDDs). The previous part of the book covered Spark’s Structured APIs. You should heavily favor these APIs in almost all scenarios. That being said, there ar ...

Translation: Chapter 11 Datasets

Posted on 2019-11-07 | In English
Words count in article 6.2k | Reading time ≈ 29
Chapter 11. Datasets (Translator: https://snaildove.github.io) Datasets are the foundational type of the Structured APIs. We already worked with DataFrames, which are Datasets of type Row, and are available acr ...

Translation: Chapter 14 Distributed Shared Variables

Posted on 2019-11-07 | In English
Words count in article 3.9k | Reading time ≈ 18
Chapter 14. Distributed Shared Variables. In addition to the Resilient Distributed Dataset (RDD) interface, the second kind of low-level API in Spark is two types of “distributed shared variables”: bro ...

Translation: Chapter 13 Advanced RDDs

Posted on 2019-11-07 | In English
Words count in article 8.3k | Reading time ≈ 39
Chapter 13. Advanced RDDs. Chapter 12 explored the basics of single RDD manipulation. You learned how to create RDDs and why you might want to use them. In addition, we discussed map, filter, reduce, ...

Translation: Chapter 3 A Tour of Spark’s Toolset

Posted on 2019-11-07 | In English
Words count in article 10.1k | Reading time ≈ 47
Chapter 3. A Tour of Spark’s Toolset (Translator: https://snaildove.github.io) In Chapter 2, we introduced Spark’s core concepts, like transformations and actions, in the context of Spark’s Structured APIs. Th ...
© 2018 — 2023 SnailDove | Site words total count 929.1k