One of the keys to understanding pandas is to understand the data model. At the core of pandas are two data structures:

Data Structure | Dimensionality | Spreadsheet Analog |
---|---|---|

Series |
1D | Column |

DataFrame |
2D | Single Sheet |

The most widely used data structures are the `Series` and the
`DataFrame` that deal with array data and tabular data,
respectively. An analogy with the spreadsheet world illustrates the
basic differences between these types. A `DataFrame` is similar to a
sheet with rows and columns, while a `Series` is similar to a single
column of data.

Diving into these core data structures a little more is useful because a bit of
understanding goes a long way towards better use of the library. We will spend a
good portion of time discussing the `Series` and `DataFrame`. Both the
`Series` and `DataFrame` share features. For example, they both have an
index, which we will need to examine to understand how pandas works.

Also, because the `DataFrame` can be thought of as a collection of columns
that are really `Series` objects, it is imperative that we have a comprehensive
study of the `Series` first. Additionally (and perhaps odd to some), we will see this when we iterate over rows, and the rows are
represented as `Series`.

Note

Some have compared the data structures to Python lists or dictionaries, and I think this is a stretch that doesn't provide much benefit. Mapping the list and dictionary methods on top of pandas' data structures just leads to confusion.

## Summary

The pandas library includes three main data structures and associated functions for manipulating them.
This book will focus on the `Series` and `DataFrame`. First, we will look at the `Series` as
the `DataFrame` can be thought of as a collection of `Series`.