Understanding Data Types in Python

Every variable in Python has a data type that determines what kind of value it holds and what you can do with it. If you’re working with data, understanding data types is important.


Last week, we introduced the dataset we’ll be using throughout the series. We’ll return to it soon, but first, we need to understand some additional aspects of Python, especially how it handles different types of data.

Common Data Types in Python

Python has several built-in data types. Below are the most commonly used ones.

1. Strings (str) – Used for text values

name = "Alice"  # A string (text value representing a name)
print(name)  # Output the string value

2. Integers (int) – Used for whole numbers

age = 25  # An integer (a whole number representing age)
print(age)  # Output the integer value

3. Floats (float) – Used for decimal numbers

height = 5.7  # A floating-point number (a decimal representing height)
print(height)  # Output the float value

4. Booleans (bool) – Used for True/False values

is_student = True  # A boolean (True/False value indicating student status)
print(is_student)  # Output the boolean value

5. None Type (NoneType) – Represents a missing or undefined value

grade = None  #  This means "no value"
print(grade)  # Output None

A quick note on NoneType

None is a special value in Python that means “nothing” or “no value here.” When you assign a variable to None, Python gives it the data type NoneType.

NoneType is simply the name of the type that None belongs to. It tells you the variable doesn’t hold a value of another built-in type like int, float, str, list, dict, or bool.

It’s useful when:

  • You want to say “this doesn’t have a value yet.”
  • A function doesn’t return anything.

You might hear someone say a variable is “non-type,” but what they really mean is that the variable’s value is None.
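
Here is a small sketch of both situations. The function name show_greeting is made up for illustration, and the check uses is None, which is the usual way to test for None.

```python
# show_greeting is a hypothetical function used only for this example.
def show_greeting(name):
    print("Hello,", name)  # Prints a message, but has no return statement


result = show_greeting("Alice")  # A function without return gives back None
grade = None  # "This doesn't have a value yet"

print(result is None)  # True
print(grade is None)   # True
print(type(grade))     # <class 'NoneType'>
```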

How to Check a Variable’s Data Type

Use the type() function to find out what kind of value you’re working with.

The calls below all follow the same pattern, but I’m showing each one for emphasis, because you always want to check what type of value you are working with. You will see why this matters when you work with a dataset where values might (intentionally or unintentionally) have unexpected types.

print(type(name))  # Check the type of a string variable
print(type(age))  # Check the type of an integer variable
print(type(height))  # Check the type of a float variable
print(type(is_student))  # Check the type of a boolean variable
print(type(grade))  # Check the type of a NoneType variable
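
For reference, here is a sketch of what those calls print, using the same example values as earlier in this post. isinstance() is a built-in alternative for when you want a True/False answer rather than the type itself.

```python
name = "Alice"
age = 25
height = 5.7
is_student = True
grade = None

print(type(name))        # <class 'str'>
print(type(age))         # <class 'int'>
print(type(height))      # <class 'float'>
print(type(is_student))  # <class 'bool'>
print(type(grade))       # <class 'NoneType'>

# isinstance() asks "is this value of that type?"
print(isinstance(age, int))     # True
print(isinstance(height, int))  # False
```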

Converting Between Data Types

Sometimes, you need to change or convert a variable’s data type. This is called type conversion.

1. Converting a String to an Integer

num_str = "100"  # A string containing numeric characters
num_int = int(num_str)  # Convert to an integer for numerical operations
print(num_int, type(num_int))  # Output converted integer and its type
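
Not every string can be converted. int() raises a ValueError when the text is not a whole number, which is common in messy data. Below is a minimal sketch of one way to handle that, assuming you want None for values that cannot be converted. The helper name to_int_or_none is made up for this example; it is not a built-in.

```python
# to_int_or_none is a hypothetical helper, not part of Python itself.
def to_int_or_none(text):
    """Return int(text), or None if the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None


print(to_int_or_none("100"))   # 100
print(to_int_or_none("10.5"))  # None (int() does not accept decimal text)
print(to_int_or_none("abc"))   # None
```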

2. Converting an Integer to a Float

num = 42  # A whole number (integer)
num_float = float(num)  # Convert to a floating-point number
print(num_float, type(num_float))  # Output converted float and its type

3. Converting a Number to a String

age = 30  # An integer representing age
age_str = str(age)  # Convert to a string for text display
print(age_str, type(age_str))  # Output converted string and its type

4. Converting Values to Boolean

print(bool(0))  # False
print(bool(1))  # True, any non-zero number is 'True'
print(bool(""))  # False, an empty string is 'False'
print(bool("Hello"))  # True, non-empty strings are 'True'
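
One thing that can surprise people: bool() on a string only checks whether the string is empty, not what the text says. A short sketch, with a comparison as one possible way to turn a "Yes"/"No" answer into a boolean (the variable names are just for illustration):

```python
print(bool("False"))  # True, because the string is not empty
print(bool("0"))      # True, for the same reason
print(bool(0.0))      # False, zero as a float is still zero
print(bool(None))     # False

# To turn a text answer into a boolean, compare it instead of calling bool().
answer = "No"
has_hypertension = (answer == "Yes")
print(has_hypertension)  # False
```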

Notice that I am not overwriting variables. Instead, I am creating a new variable each time. That helps avoid confusion when you need the same value in different types for different reasons.

Quick Practice

Try these out to test what you’ve learned.

1. Check the type of a variable

value = "42"  # A string containing numeric characters
print(value, type(value))  # Output value and its type

2. Convert a string to a float and print it

decimal_str = "10.5"  # A string representation of a decimal number
decimal_float = float(decimal_str)  # Convert the string into a float, kept in a new variable
print(decimal_float, type(decimal_float))  # Output converted float and its type

3. Try converting a boolean to an integer

boolean_value = True  # A boolean variable
int_value = int(boolean_value)  # Convert to integer
print(int_value, type(int_value))  # Output converted integer and its type
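
Because True converts to 1 and False converts to 0, adding booleans together counts how many of them are True. A small sketch with made-up values:

```python
# Three hypothetical employment flags
is_employed_1 = True
is_employed_2 = False
is_employed_3 = True

# Adding booleans treats True as 1 and False as 0
employed_count = is_employed_1 + is_employed_2 + is_employed_3
print(employed_count)        # 2
print(type(employed_count))  # <class 'int'>
```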

Summary

  • Python has several built-in data types: strings, integers, floats, booleans, and NoneType.
  • Use type() to check a variable’s data type/the kind of value you’re working with.
  • Convert data types using int(), float(), str(), and bool().

Next, we will introduce data structures in Python, starting with lists and dictionaries.

Recommended Python Books

Introducing the Dataset

As I mentioned in the last post, to make our lessons more meaningful, we’ll be working with a fictional dataset that contains demographic, socioeconomic, and health variables. Think of it as data collected from a survey.

The dataset has 100 rows, and each row represents one person.

Included Variables

Demographic Variables

  • ID – A unique ID number for each person (integer)
  • State – The U.S. state where the person lives (string)
  • City – The person’s city of residence (string)
  • Year of Birth – The person’s birth year (integer)
  • Age – The person’s age, based on the current year (integer)
  • Sex – The person’s sex ("Male" or "Female") (string)
  • Race – Racial category ("White", "Black", "Hispanic", etc.) (string)

Socioeconomic Variables

  • Education – Highest level of education completed (e.g. "High School", "Bachelor's") (string)
  • Income – Annual income in dollars (integer; some values may be missing)
  • Employment Status – Whether they’re employed, unemployed, a student, or retired (string)

Health Variables

  • Self-Rated Health – How the individual perceives their own health ("Excellent", "Good", "Fair", "Poor") (string)
  • Hypertension – Whether the individual has high blood pressure ("Yes", "No") (string)
  • BMI – Body Mass Index, calculated from height and weight (float, may contain missing values)

Example Data Row

Here’s what rows in the dataset might look like.

| ID | State | City | Age | Sex | Race | Education | Income | Employment Status | Self-Rated Health | Hypertension | BMI |
| 123456 | Texas | Houston | 34 | Male | Black | Bachelor’s | 55000 | Employed | Good | No | 24.5 |
| 987654 | California | Los Angeles | 29 | Female | White | High School | 32000 | Student | Excellent | No | 22.1 |

Understanding Data Types

When working with survey data, you’ll often see a mix of data types.

  • Strings: Text-based values like state and city.
  • Integers: Whole numbers like age and income.
  • Floats: Decimal numbers, like BMI.
  • Categorical Variables: Limited values like education levels or employment status.
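
As a rough sketch, here is how values like these might look as Python variables. The values are based on the first example row, except income, which is set to None here to show one possible way of marking a missing value. The variable names are just for illustration.

```python
city = "Houston"          # String
age = 34                  # Integer
bmi = 24.5                # Float
education = "Bachelor's"  # Categorical value, stored as a plain string
income = None             # Assumption: a missing income represented as None

print(type(city))       # <class 'str'>
print(type(age))        # <class 'int'>
print(type(bmi))        # <class 'float'>
print(type(education))  # <class 'str'>
print(income is None)   # True
```

Python itself has no separate "categorical" type, so at this stage a category is simply a string with a limited set of allowed values.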

Quick Practice

Let’s create a small version of our dataset using Python variables. We learned how to create variables in Python in the last post.

person_id = 123456 # Unique ID (integer)
state = "Texas" # State name (string)
age = 34 # Age in years (integer)
income = 55000 # Income in dollars (integer)
is_employed = True # Boolean variable for employment status
print("Person ID:", person_id)
print("Lives in:", state)
print("Age:", age)
print("Annual Income:", income)
print("Employed?", is_employed)

Breaking Down the Code

Let’s analyze each part of our code to understand what’s happening and how it relates to our dataset. Below are some questions I get from students.

1. What Comes Immediately After print?

After print, we see a string (in quotes) followed by a variable. The string inside print() is not tied to the dataset—it’s just a label that makes the output readable.

Each print() line has:

  • A label in quotes (e.g., "Person ID:")
  • A variable (e.g., person_id)

Example:

print("Person ID:", person_id)
  • "Person ID:" → This is a string that acts as a label.
  • person_id → This is a variable holding a value (an integer in this case).
  • The comma (,) separates the label and the actual variable value.

Output:

Person ID: 123456
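
The comma does one more thing: print() places a single space between the items it receives. Here is a sketch that builds the same line by hand with str() and +, and then uses print()'s sep parameter to remove the space:

```python
person_id = 123456

# print() adds one space between the label and the value
print("Person ID:", person_id)  # Person ID: 123456

# The same text built by hand; the integer must be converted with str() first
line = "Person ID:" + " " + str(person_id)
print(line)  # Person ID: 123456

# sep controls what goes between the items (an empty string means no space)
print("Person ID:", person_id, sep="")  # Person ID:123456
```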

Keep in mind that the label does NOT change the variable name.

The lines below all print the same value. They work because the variable name (is_employed) does not change; only the label text inside print() changes.

print("Is the person employed?", is_employed)
print("Job Status:", is_employed)
print("Employment Check:", is_employed)
print("Works for a company?", is_employed)

2. Why Use Colons (:) in print Statements?

The colon (:) in print statements is just my preference for formatting the output.

Example:

print("Annual Income:", income)

Prints like this:

Annual Income: 55000

It separates the label from the value, which makes the output more readable.

But you could also do:

print("Annual Income", income)

Which prints:

Annual Income 55000

Both are fine, but the colon just makes it a little easier to read.
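
Another common way to combine a label and a value is an f-string (available in Python 3.6 and later). A short sketch; the dollar sign and the thousands separator are just one possible way to present the number:

```python
income = 55000

# The value inside the braces is inserted into the text
plain = f"Annual Income: {income}"

# ":," inside the braces adds a comma as a thousands separator
pretty = f"Annual Income: ${income:,}"

print(plain)   # Annual Income: 55000
print(pretty)  # Annual Income: $55,000
```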

In the next post, we’ll start working with different data types and operations.


Python Variables

By now, you should have installed Python on your computer. Follow these steps if you still need to do so.

A variable is like a labeled box that holds information. You give it a name, assign it a value, and the program, in this case, Python, holds onto that value for you while it runs your code.

Example:

name = "Alice"  # Assigning a string to the variable 'name'
age = 25        # Assigning an integer to the variable 'age'
height = 5.7    # Assigning a float to the variable 'height'

Here, name stores a string ("Alice"), age stores an integer (25), and height stores a float (5.7).

Variable Naming Rules

You can name your variables almost anything, but there are a few rules to follow in Python. The variable:

  • Must start with a letter or an underscore _
  • Can include letters, numbers, and underscores
  • Is case-sensitive (Age and age are different variables!)
  • Has no spaces or special symbols like @, $, !
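
If you are unsure whether a name is allowed, Python can tell you. This sketch uses the string method isidentifier() and the standard keyword module; the candidate names are made up. Note that reserved words such as class pass the character rules but still cannot be used as variable names.

```python
import keyword

# isidentifier() checks the character rules listed above
print("age".isidentifier())        # True
print("_total".isidentifier())     # True
print("2nd_place".isidentifier())  # False, starts with a number
print("my age".isidentifier())     # False, contains a space
print("total$".isidentifier())     # False, contains a special symbol

# Reserved words look valid but are not allowed as variable names
print("class".isidentifier())      # True
print(keyword.iskeyword("class"))  # True
```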

Changing Variables

Variables can change over time. In the example below, age is first set to 25 and then reassigned to 26, so age ends up holding the new value.

age = 25  # Initial value
age = 26  # Reassigned value; age is now 26
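
Reassignment can also change the kind of value a variable holds, because Python does not lock a variable to one type. A small sketch:

```python
age = 26
print(type(age))  # <class 'int'>

# The same name can later point to a value of a different type
age = "twenty-six"
print(type(age))  # <class 'str'>
```

This flexibility is convenient, but it is also a reason to check types when a value does not behave the way you expect.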

Quick Practice

Try running this code. You should see a simple sentence printed out.

city = "New York"  # Assigning a city name
temperature = 75  # Assigning a temperature value
print("I live in", city, "where the temperature is", temperature, "degrees.")

Try changing the values and running it again.

In the next post, we’ll introduce the fictional dataset we’ll be working with throughout the series. I’ve found that I learn best when working with meaningful data, so I created a sample dataset that feels similar to something you might actually come across in health or demographic analyses.

We will check the dataset out in the next post. See you there.


Starting with Python

Before we actually start writing code, we need to download and install Python and then pick a code editor. That’s what we’ll cover in this short post.

Installing Python

Step 1: Download Python

  1. Go to python.org.
  2. Download the latest version for your operating system (Windows, macOS, or Linux).
  3. During installation, make sure to check the box that says “Add Python to PATH.”
  • This makes sure you can run Python from the command line without extra setup.

Step 2: Verify the Installation

Let’s check that Python was installed correctly.

  • On Windows: Open Command Prompt (search for cmd in the Start Menu).
  • On Mac: Open Terminal (search for Terminal in Spotlight Search).

Type this command:

python --version

If everything worked, you’ll see the version number printed. On macOS or Linux, you may need to type python3 --version instead.
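
Once Python is running, you can also check the version from inside Python itself. A minimal sketch using the standard sys module:

```python
import sys

# Full version text, including build details
print(sys.version)

# The major and minor numbers on their own
major = sys.version_info.major
minor = sys.version_info.minor
print(f"Running Python {major}.{minor}")
```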

Troubleshooting Installation Issues

If something doesn’t go quite right, don’t worry—there are plenty of other guides to walk you through it.

Here are some useful guides from geeksforgeeks:

Choosing an Editor

Python code can be written in a bunch of different editors. Try a few out and see what feels right.

  • Jupyter Notebook – Great for beginners and especially good for data analysis.
  • VS Code – An editor with excellent Python support. This is my go-to because I use multiple languages. It is extensible, so you can add plugins and customize it however you like.
  • IDLE – Python’s built-in editor. It’s basic, but it does the job.
  • PyCharm – An editor built specifically for Python. Great if you’re doing bigger projects.

Running Python Code

There are several ways to run Python:

  1. In a script – Save your code in a .py file and run it.
  2. In a Jupyter Notebook – Great for running code one chunk at a time.
  3. In the terminal/command prompt – You can run Python interactively right from the command line.

Try this simple command in whichever setup you choose:

print("Hello, Python!")

If you see that message printed back to you, you’re good to go!
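
As a sketch of the script option, the code below could be saved in a file and run from the terminal. The file name hello.py is only an example.

```python
# Contents of hello.py (a hypothetical file name).
# From the folder that contains the file, run it with:  python hello.py
message = "Hello, Python!"
print(message)
```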

What’s Next?

In the next post, we’ll introduce variables. After that, we’ll take a look at the fictional dataset we’ll be using throughout the series.

See you in the next one!


What Is Python?

Python is a popular programming language used across many industries for a wide variety of tasks. I have written a bit about it before, when I was trying to understand whether Python is easier to learn than R.

Python is known for its simplicity, readability, and versatility. Because it is a general-purpose language, it can be used for many different types of programming, including data analysis, web development, automation, and machine learning.

See below for a list of what Python can be used to accomplish.

Data Analysis, Data Visualization, and Model Building

This blog series will focus mostly on Python for data analysis and will cover a range of topics like analyzing data with Pandas and NumPy and creating visualizations with Matplotlib and Seaborn.

We will also go over some important topics in model building. Although Python can be used to develop AI models using Scikit-Learn and TensorFlow, we will focus on much simpler models in this series.

Python is also a great tool for automating repetitive tasks. We will not cover much of this, but it is good to know that Python can be used to write scripts that automate things like renaming files or processing data.

Web development is another area where Python excels, as it can also be used to create websites and web apps with frameworks like Flask and Django.

Python is also widely used in penetration testing and cybersecurity (think ethical hacking). We will not cover this aspect at all.

Because of its versatility, applications of Python extend beyond data science and web development and can include:

  • Game development with Pygame.
  • Building desktop applications with Tkinter.
  • Writing scripts for system administration and networking.

Python is one of the most commonly used languages in data analysis, which makes it an excellent choice if you are just starting to work with data. It is also one of the better languages to learn if you are new to programming.

In the next post, we’ll go over how to install Python.


Let’s Learn Python. Finally!

If you are here, it means you’re interested in learning Python! That’s great! I’ve put it off for a while, and now I’m getting into it.

People say Python is one of the easiest programming languages to learn, but I’m still not sure about that. Easy is relative. However, if you want to improve your programming skills or desire to get into data science or machine learning, Python is a great place to start.

I often work with Stata and R but I have recently chosen to learn Python programming–starting first with an area (data analysis) I’m already proficient in.

I committed to improving my data analysis skills a while ago, and writing about it got tedious. But now, I need to learn Python for work. So what better way to cement my learning than to teach others through a blog series?

The Python Blog Series

This Python blog series will be useful for beginners like me—people who want a structured, easy-to-follow guide with clear examples and useful explanations.

By the end of it, you should be able to write basic Python code. If not, there are hundreds of videos on YouTube on the topic.

The plan is to post a new/short lesson at least once a week (let’s see how that goes).

What Will We Cover?

Each post will focus on a specific topic – likely something I am working through at a specific point in time.

  • Python basics (variables, data types, functions)
  • Working with Pandas to analyze data
  • Cleaning and filtering datasets
  • Summarizing data with statistics
  • Plotting graphs and visualizing trends
  • Performing regressions and predictive modeling

We will take it step by step. Each post will focus on one concept at a time. The goal is to provide clear and simple explanations!

I work best with realistic data, so I created a fictional dataset we can use. We will get into it later.

Who Is This Series For?

If you’ve never coded before, or if you’ve used another programming language but want to switch to Python, this will be a great series for you.

In the next post, we’ll learn more about Python and try to understand why this programming language might be helpful.


Which is easier to learn: Python or R?

As many of you know, I am committed to learning and writing in public. That means writing about what I learn, showcasing the tools and resources I use, and showing my work or unfinished products.

My current goal is to learn Python or R. While I am familiar with both, I need to start from scratch to advance my skills. Ideally, I would like to start with the easier language (based on my current skill level and knowledge of other programming tools). But is Python easier to learn than R, or is R the easier language?

Some caveats

I am interested in data analyses, graphics/visualization, and general programming. While I know there are some things Python may do better than R, or vice versa, I am interested in using the more robust tool.

I am also well versed in Stata and I’m satisfied with my knowledge of the app, but I find that I’m limited by not knowing one of the widely recognized programming languages. I am also aware that Stata has PyStata, a new Python and Stata integration that allows users to use Stata from a Python environment. This feature is an added plus for me to pursue learning Python.

Two important notes

  • One is not necessarily better than the other. I am only trying to decide which tool best fits my needs.

What is Python?

Python is a general-purpose programming language. It is used for web and software development, scripting, and data analysis. Users can access packages developed by other Python users, including scikit-learn, SciPy, and NumPy.

Some users suggest that Python is a more general tool than R.

What is R?

R is a language and environment for statistical computing. It has a rich ecosystem of open-source packages, including tidyverse, dplyr, and tidyr. It is primarily used for statistical analysis. R would be sufficient if I were only interested in data analyses and modeling (e.g., linear and nonlinear regression).

So far, I’ve learned three important things about R:

  • R syntax is not as readable as Python’s
  • It is not necessarily a general-purpose programming language
  • It is mainly for statistics, though the program allows for other possibilities

Python or R?

Both Python and R are free, open-source programming languages. Both have supportive communities that contribute to different packages. Some users of both languages find visualization in R generally better than in Python. I’m familiar with ggplot2 because I’ve come across many excellent R visualizations that have used the package to produce high-quality graphics.

Overall, learning Python is more accessible and friendly for people just getting started with coding. The general advice is to learn Python if you are new to coding; learn R if you are focused mainly on statistics or data analyses.

I was told to decide whether to learn Python or R based on my goals. Since I am not only focused on data analyses, I’ll stick with learning Python and look to strengthen my R skills down the line. Python and R can also communicate with each other, so I may look into setting this up once I become proficient with both.
