R数据科学 第2版(影印版)
Hadley Wickham, Mine Cetinkaya-Rundel, Garrett Grolemund
出版时间:2024年03月
页数:576
“这是对世界领先的数据科学R语言指南的一次重大更新。所有与数据打交道的人都不应该错过本书!”
——Emma Rand
英国约克大学

使用R将数据转化为洞见、知识和理解。通过这本实践用书,有志成为数据科学家的读者将掌握如何使用R和RStudio从事数据科学,同时还会学习tidyverse,这是一组R软件包的集合,旨在协同工作,使数据科学变得快速、流畅、有趣。即使你没有编程经验,这本更新版也能让你快速上手数据科学。
你将学习如何导入、变换以及可视化你的数据,并传达结果。你将从宏观上全面了解数据科学周期以及管理细节所需的基本工具。全书根据最新的tidyverse特性和最佳实践进行了更新,新的章节向你展示了如何从电子表格、数据库和网站获取数据。书中提供的练习有助于你将理论应用于实践。
你将理解如何:
● 可视化:创建用于数据探索和结果传达的图表
● 变换:发现变量类型及其处理工具
● 导入:将数据以便于分析的形式传给R
● 编程:学习R工具,更清晰、更轻松地解决数据问题
● 交流:用Quarto整合普通文本、代码和结果
  1. Introduction
  2. Part I. Whole Game
  3. 1. Data Visualization
  4. Introduction
  5. First Steps
  6. ggplot2 Calls
  7. Visualizing Distributions
  8. Visualizing Relationships
  9. Saving Your Plots
  10. Common Problems
  11. Summary
  12. 2. Workflow: Basics
  13. Coding Basics
  14. Comments
  15. What’s in a Name?
  16. Calling Functions
  17. Exercises
  18. Summary
  19. 3. Data Transformation
  20. Introduction
  21. Rows
  22. Columns
  23. The Pipe
  24. Groups
  25. Case Study: Aggregates and Sample Size
  26. Summary
  27. 4. Workflow: Code Style
  28. Names
  29. Spaces
  30. Pipes
  31. ggplot2
  32. Sectioning Comments
  33. Exercises
  34. Summary
  35. 5. Data Tidying
  36. Introduction
  37. Tidy Data
  38. Lengthening Data
  39. Widening Data
  40. Summary
  41. 6. Workflow: Scripts and Projects
  42. Scripts
  43. Projects
  44. Exercises
  45. Summary
  46. 7. Data Import
  47. Introduction
  48. Reading Data from a File
  49. Controlling Column Types
  50. Reading Data from Multiple Files
  51. Writing to a File
  52. Data Entry
  53. Summary
  54. 8. Workflow: Getting Help
  55. Google Is Your Friend
  56. Making a reprex
  57. Investing in Yourself
  58. Summary
  59. Part II. Visualize
  60. 9. Layers
  61. Introduction
  62. Aesthetic Mappings
  63. Geometric Objects
  64. Facets
  65. Statistical Transformations
  66. Position Adjustments
  67. Coordinate Systems
  68. The Layered Grammar of Graphics
  69. Summary
  70. 10. Exploratory Data Analysis
  71. Introduction
  72. Questions
  73. Variation
  74. Unusual Values
  75. Covariation
  76. Patterns and Models
  77. Summary
  78. 11. Communication
  79. Introduction
  80. Labels
  81. Annotations
  82. Scales
  83. Themes
  84. Layout
  85. Summary
  86. Part III. Transform
  87. 12. Logical Vectors
  88. Introduction
  89. Comparisons
  90. Boolean Algebra
  91. Summaries
  92. Conditional Transformations
  93. Summary
  94. 13. Numbers
  95. Introduction
  96. Making Numbers
  97. Counts
  98. Numeric Transformations
  99. General Transformations
  100. Numeric Summaries
  101. Summary
  102. 14. Strings
  103. Introduction
  104. Creating a String
  105. Creating Many Strings from Data
  106. Extracting Data from Strings
  107. Letters
  108. Non-English Text
  109. Summary
  110. 15. Regular Expressions
  111. Introduction
  112. Pattern Basics
  113. Key Functions
  114. Pattern Details
  115. Pattern Control
  116. Practice
  117. Regular Expressions in Other Places
  118. Summary
  119. 16. Factors
  120. Introduction
  121. Factor Basics
  122. General Social Survey
  123. Modifying Factor Order
  124. Modifying Factor Levels
  125. Ordered Factors
  126. Summary
  127. 17. Dates and Times
  128. Introduction
  129. Creating Date/Times
  130. Date-Time Components
  131. Time Spans
  132. Time Zones
  133. Summary
  134. 18. Missing Values
  135. Introduction
  136. Explicit Missing Values
  137. Implicit Missing Values
  138. Factors and Empty Groups
  139. Summary
  140. 19. Joins
  141. Introduction
  142. Keys
  143. Basic Joins
  144. How Do Joins Work?
  145. Non-Equi Joins
  146. Summary
  147. Part IV. Import
  148. 20. Spreadsheets
  149. Introduction
  150. Excel
  151. Google Sheets
  152. Summary
  153. 21. Databases
  154. Introduction
  155. Database Basics
  156. Connecting to a Database
  157. dbplyr Basics
  158. SQL
  159. Function Translations
  160. Summary
  161. 22. Arrow
  162. Introduction
  163. Getting the Data
  164. Opening a Dataset
  165. The Parquet Format
  166. Using dplyr with Arrow
  167. Summary
  168. 23. Hierarchical Data
  169. Introduction
  170. Lists
  171. Unnesting
  172. Case Studies
  173. JSON
  174. Summary
  175. 24. Web Scraping
  176. Introduction
  177. Scraping Ethics and Legalities
  178. HTML Basics
  179. Extracting Data
  180. Finding the Right Selectors
  181. Putting It All Together
  182. Dynamic Sites
  183. Summary
  184. Part V. Program
  185. 25. Functions
  186. Introduction
  187. Vector Functions
  188. Data Frame Functions
  189. Plot Functions
  190. Style
  191. Summary
  192. 26. Iteration
  193. Introduction
  194. Modifying Multiple Columns
  195. Reading Multiple Files
  196. Saving Multiple Outputs
  197. Summary
  198. 27. A Field Guide to Base R
  199. Introduction
  200. Selecting Multiple Elements with [
  201. Selecting a Single Element with $ and [[
  202. Apply Family
  203. for Loops
  204. Plots
  205. Summary
  206. Part VI. Communicate
  207. 28. Quarto
  208. Introduction
  209. Quarto Basics
  210. Visual Editor
  211. Source Editor
  212. Code Chunks
  213. Figures
  214. Tables
  215. Caching
  216. Troubleshooting
  217. YAML Header
  218. Workflow
  219. Summary
  220. 29. Quarto Formats
  221. Introduction
  222. Output Options
  223. Documents
  224. Presentations
  225. Interactivity
  226. Websites and Books
  227. Other Formats
  228. Summary
  229. Index
书名:R数据科学 第2版(影印版)
国内出版社:东南大学出版社
出版时间:2024年03月
页数:576
书号:978-7-5766-1189-2
原版书书名:R for Data Science, 2nd Edition
原版书出版商:O'Reilly Media
Hadley Wickham
 
Hadley Wickham,RStudio首席科学家,莱斯大学助理教授,资深R社区成员,已开发了30多个R包。因在数据处理和可视化开发工具方面的卓越贡献,获得专为统计计算而设立的约翰·钱伯斯奖。

Hadley Wickham是RStudio(现已更名为Posit)的首席科学家,2019年COPSS(统计学协会主席委员会)奖得主,R基金会成员。他构建了计算和认知工具,以使数据科学更容易、更快、更有趣。他的工作包括数据科学包(如tidyverse,其中包括ggplot2、dplyr和tidyr)和基础软件开发包(roxygen2、testthat和pkgdown)。他也是一位作家、教育家和演说家,提倡将R语言用于数据科学。您可以从他的网站hadley.nz上了解更多信息。
 
 
Mine Cetinkaya-Rundel
 
Mine Cetinkaya-Rundel是杜克大学统计科学系的实践教授和本科教务主任。她还是Posit公司的开发者教育工作者。
 
 
Garrett Grolemund
 
Garrett Grolemund是Hands-On Programming with R一书的作者,也是Posit公司的学习主管。
 
 
The animal on the cover of R for Data Science is the kakapo (Strigops habroptilus). Also known as the owl parrot, the kakapo is a large flightless bird native to New Zealand. Adult kakapos can grow up to 64 centimeters in height and 4 kilograms in weight. Their feathers are generally yellow and green, although there is significant variation between individuals. Kakapos are nocturnal and use their robust sense of smell to navigate at night. Although they cannot fly, kakapos have strong legs that enable them to run and climb much better than most birds.

The name kakapo comes from the language of the native Maori people of New Zealand. Kakapos were an important part of Maori culture, both as a food source and as a part of Maori mythology. Kakapo skin and feathers were also used to make cloaks
and capes.

Due to the introduction of predators to New Zealand during European colonization, kakapos are now critically endangered, with less than 200 individuals currently living. The government of New Zealand has been actively attempting to revive the kakapo population by providing special conservation zones on three predator-free islands.
购买选项
定价:158.00元
书号:978-7-5766-1189-2
出版社:东南大学出版社