-
Data Science from Scratch
Data science libraries, frameworks, modules, and toolkits are great for doing data science, but they’re also a good way to dive into the discipline without actually understanding data science. In this book, you’ll learn how many of the most fundamental data science tools and algorithms work by implementing them from scratch. If you have an aptitude for mathematics and some programming skills, author Joel Grus will help you get comfortable with the math and statistics at the core of data science, and with the hacking skills you need to get started as a data scientist. Today’s messy glut of data holds answers to questions no one’s even thought to ask. This book provides you with the know-how to dig those answers out. Get a crash course in Python. Learn the basics of linear algebra, statistics, and probability, and understand how and when they're used in data science. Collect, explore, clean, munge, and manipulate data. Dive into the fundamentals of machine learning. Implement models such as k-nearest neighbors, Naive Bayes, linear and logistic regression, decision trees, neural networks, and clustering. Explore recommender systems, natural language processing, network analysis, MapReduce, and databases. -
The Elements of Statistical Learning (2nd Edition) (English Edition)
-
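Modern Pattern Recognition
Modern Pattern Recognition gives a systematic, in-depth treatment of the theory and methods of pattern recognition, along with a fairly comprehensive survey of recent scientific and technical results in the field. The book has 12 chapters and covers the mainstream pattern recognition techniques: statistical pattern recognition, fuzzy pattern recognition, neural network techniques, artificial intelligence methods, and syntactic pattern recognition. Chapter 1 is an introduction. Chapters 2 through 7 cover statistical pattern recognition, including cluster analysis, algebraic discriminant boundary equation methods, statistical decision, training and error-rate estimation, feature extraction and selection, and the nearest-neighbor method. Chapter 11, on information fusion, concentrates on the fusion techniques used in recognition and decision-making. Chapter 12, on artificial intelligence methods, focuses on uncertain reasoning. The other types of recognition methods are covered in the remaining chapters. -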
Graph-based Natural Language Processing and Information Retrieval
Graph theory and the fields of natural language processing and information retrieval are well-studied disciplines. Traditionally, these areas have been perceived as distinct, with different algorithms, different applications, and different potential end-users. However, recent research has shown that these disciplines are intimately connected, with a large variety of natural language processing and information retrieval applications finding efficient solutions within graph-theoretical frameworks. This book extensively covers the use of graph-based algorithms for natural language processing and information retrieval. It brings together topics as diverse as lexical semantics, text summarization, text mining, ontology construction, text classification, and information retrieval, which are connected by the common underlying theme of the use of graph-theoretical methods for text and information processing tasks. Readers will come away with a firm understanding of the major methods and applications in natural language processing and information retrieval that rely on graph-based representations and algorithms. -
Statistical Models
Synopsis of Statistical Models: Theory and Practice (English Edition, 2nd Edition): Some books are correct. Some are clear. Some are useful. Some are entertaining. Few are even two of these. This book is all four. Statistical Models: Theory and Practice is lucid, candid and insightful, a joy to read. We are fortunate that David Freedman finished this new edition before his death in late 2008. We are deeply saddened by his passing, and we greatly admire the energy and cheer he brought to this volume, and many other projects, during his final months. -
Modern Multivariate Statistical Techniques
This is the first book on multivariate analysis to look at large data sets, and it describes the state of the art in analyzing such data. It includes material, such as database management systems, that has never appeared in statistics books before.