Cyclistic Bikes

Business Task

Analyze data from casual riders and annual members to identify how the two groups use Cyclistic bikes differently, and use those trends to determine how to maximize annual memberships.

Data Sources Used

For this case study, I used Cyclistic’s historical trip data for the previous 12 months, May 2021 through April 2022. The data is a public data set from Motivate International Inc. and was used under this license. The data is organized in Excel spreadsheets by month.

Column Titles

ride_id, rideable_type, started_at, ended_at, start_station_name, start_station_id, end_station_name, end_station_id, start_lat, start_lng, end_lat, end_lng, member_casual

The data does not include customer information that would show how much an individual uses the service, nor any billing or credit card information that would show whether customers live in the area. This also means that we cannot determine how many times a specific customer uses the service or whether casual users convert to members.

Cleaning and Organizing the Data

For the cleaning process, I considered using Google Sheets, but the amount of data in each spreadsheet noticeably slowed down the application, so I decided to load my data into BigQuery. After loading my data, I created a new table, year_data, that combines the monthly data into one table to make it easier to work with.
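The queries themselves are not shown here, so the following is only a minimal Python sketch of the combining step, not the BigQuery statement that was actually run. The two monthly tables and their rows are hypothetical sample data; only the column names and the year_data table name come from the text above.

```python
# Hypothetical monthly extracts; each real file would hold one month of trips.
may_2021 = [
    {"ride_id": "A1", "rideable_type": "classic_bike", "member_casual": "member"},
]
june_2021 = [
    {"ride_id": "B1", "rideable_type": "electric_bike", "member_casual": "casual"},
    {"ride_id": "B2", "rideable_type": "docked_bike", "member_casual": "casual"},
]


def combine_months(*monthly_tables):
    """Stack monthly tables row-wise, keeping every row (no de-duplication)."""
    combined = []
    for table in monthly_tables:
        combined.extend(table)
    return combined


year_data = combine_months(may_2021, june_2021)
print(len(year_data))
```

Stacking without de-duplication keeps the row count equal to the sum of the monthly row counts, which makes it easy to confirm that no month was dropped.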

Next, I checked for missing values in ride_id and rideable_type; neither column had any.
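A missing-value check of this kind can be sketched as follows. Treating None and blank strings as missing is an assumption, and the sample rows are hypothetical (they deliberately contain gaps so the counter has something to find).

```python
# Hypothetical rows with deliberate gaps, for illustration only.
sample_rows = [
    {"ride_id": "A1", "rideable_type": "classic_bike"},
    {"ride_id": "", "rideable_type": "electric_bike"},
    {"ride_id": "A3", "rideable_type": None},
]


def count_missing(rows, column):
    """Count rows where the column is absent, None, or a blank string."""
    missing = 0
    for row in rows:
        value = row.get(column)
        if value is None or str(value).strip() == "":
            missing += 1
    return missing


for column in ("ride_id", "rideable_type"):
    print(column, count_missing(sample_rows, column))
```

The same helper applies unchanged to started_at, ended_at, and member_casual.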

I then checked the distinct values of rideable_type and found three: electric_bike, classic_bike, and docked_bike.

Upon checking the website, I found only two types of bikes listed, classic_bike and electric_bike. I emailed the source of the data, and they confirmed that the classic_bike used to be called a docked_bike and that both names refer to the same bike. I therefore replaced the instances of ‘docked_bike’ with ‘classic_bike’ for consistency.
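As a rough illustration of the distinct-value check and the relabeling, here is a short Python sketch. The helper names distinct_values and relabel and the three sample rows are hypothetical; only the bike type labels come from the data described above.

```python
# Hypothetical rows covering the three labels found in the data.
sample_rows = [
    {"ride_id": "A1", "rideable_type": "classic_bike"},
    {"ride_id": "A2", "rideable_type": "docked_bike"},
    {"ride_id": "A3", "rideable_type": "electric_bike"},
]


def distinct_values(rows, column):
    """Return the sorted set of values that appear in a column."""
    return sorted({row[column] for row in rows})


def relabel(rows, column, old, new):
    """Return copies of the rows with `old` replaced by `new` in `column`."""
    return [
        {**row, column: new} if row[column] == old else dict(row)
        for row in rows
    ]


before = distinct_values(sample_rows, "rideable_type")
cleaned = relabel(sample_rows, "rideable_type", "docked_bike", "classic_bike")
after = distinct_values(cleaned, "rideable_type")
print(before, after)
```

Returning copies rather than editing in place leaves the original rows available for comparison after the change.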

Next, I checked for missing values in the start and end times of each ride.

There were no missing start or end times in the data. I then checked for duplicates in ride_id.
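One way to test for duplicate ride_id values is to count each value and keep those that occur more than once. This is a sketch under assumptions: the sample rows are hypothetical and intentionally contain a repeat, whereas the actual data had none.

```python
from collections import Counter

# Hypothetical rows; "A1" is repeated on purpose to show what a hit looks like.
sample_rows = [
    {"ride_id": "A1"},
    {"ride_id": "A2"},
    {"ride_id": "A1"},
]


def duplicate_ids(rows, key="ride_id"):
    """Return {value: count} for every key value that appears more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


print(duplicate_ids(sample_rows))
```

An empty result means every ride_id is unique.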

There were no duplicates in the ride_id, so I moved on to check the user type. I checked for missing values in member_casual as well as how many types were entered.

There were no missing values, and the two types entered were member and casual, as expected. Next, I created a column to calculate the length of each ride, titled ride_length.

During this step I also removed rides shorter than 60 seconds, as these could be false starts or attempts to redock a bike. I also deleted trips longer than 24 hours, as these could be bikes taken out for maintenance, and bikes are marked as stolen after 24 hours.
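The ride_length calculation and the two filters can be sketched in Python as below. The timestamp format is an assumption about how started_at and ended_at are stored, and the sample rows are hypothetical. Rides of exactly 60 seconds or exactly 24 hours are kept, matching "less than 60 seconds" and "over 24 hours" in the text.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of started_at / ended_at
MIN_RIDE = timedelta(seconds=60)
MAX_RIDE = timedelta(hours=24)

# Hypothetical rides: one too short, one normal, one too long.
sample_rows = [
    {"ride_id": "A1", "started_at": "2021-05-01 10:00:00", "ended_at": "2021-05-01 10:00:30"},
    {"ride_id": "A2", "started_at": "2021-05-01 10:00:00", "ended_at": "2021-05-01 10:15:00"},
    {"ride_id": "A3", "started_at": "2021-05-01 10:00:00", "ended_at": "2021-05-02 11:00:00"},
]


def add_ride_length(rows):
    """Add ride_length and drop rides under 60 seconds or over 24 hours."""
    kept = []
    for row in rows:
        started = datetime.strptime(row["started_at"], TIMESTAMP_FORMAT)
        ended = datetime.strptime(row["ended_at"], TIMESTAMP_FORMAT)
        ride_length = ended - started
        if MIN_RIDE <= ride_length <= MAX_RIDE:
            kept.append({**row, "ride_length": ride_length})
    return kept


cleaned = add_ride_length(sample_rows)
print([(row["ride_id"], row["ride_length"]) for row in cleaned])
```

A ride whose end time precedes its start time produces a negative length and is dropped by the same lower bound.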

I then added a column for the day of the week on which each trip started, extracted from the started_at column.
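A minimal sketch of deriving the day of the week from started_at follows. The timestamp format is assumed, and storing the day as an English name (rather than a number) is an illustrative choice, not necessarily how the original column was encoded.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of started_at

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def day_of_week(started_at):
    """Return the weekday name for a started_at timestamp string."""
    return DAY_NAMES[datetime.strptime(started_at, TIMESTAMP_FORMAT).weekday()]


print(day_of_week("2021-05-01 10:00:00"))
print(day_of_week("2021-05-03 08:30:00"))
```

Using a fixed list of names avoids locale-dependent output from strftime.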

I used this column to find the mode day of the week overall and separately for members and casual riders.
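The mode can be found by counting rides per day and taking the most frequent one. The sketch below uses hypothetical sample rows and a hypothetical helper, mode_day; it assumes at least one matching ride and does not define how ties are broken.

```python
from collections import Counter

# Hypothetical rides with the day_of_week column already added.
sample_rows = [
    {"member_casual": "member", "day_of_week": "Tuesday"},
    {"member_casual": "member", "day_of_week": "Tuesday"},
    {"member_casual": "member", "day_of_week": "Monday"},
    {"member_casual": "casual", "day_of_week": "Saturday"},
    {"member_casual": "casual", "day_of_week": "Saturday"},
    {"member_casual": "casual", "day_of_week": "Saturday"},
    {"member_casual": "casual", "day_of_week": "Sunday"},
]


def mode_day(rows, user_type=None):
    """Return the most common day_of_week, optionally for one user type."""
    days = [
        row["day_of_week"]
        for row in rows
        if user_type is None or row["member_casual"] == user_type
    ]
    return Counter(days).most_common(1)[0][0]


print(mode_day(sample_rows))
print(mode_day(sample_rows, "member"))
print(mode_day(sample_rows, "casual"))
```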

Next, I found the overall mean ride_length in minutes, the maximum ride_length, and the mean ride_length by user type.
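These summary statistics can be sketched as follows. The ride_minutes column name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not figures from the data set.

```python
from statistics import mean

# Hypothetical ride lengths in minutes.
sample_rows = [
    {"member_casual": "member", "ride_minutes": 8.0},
    {"member_casual": "member", "ride_minutes": 12.0},
    {"member_casual": "casual", "ride_minutes": 20.0},
    {"member_casual": "casual", "ride_minutes": 40.0},
]


def mean_by_user_type(rows):
    """Return {user_type: mean ride length in minutes}."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["member_casual"], []).append(row["ride_minutes"])
    return {user_type: mean(values) for user_type, values in grouped.items()}


overall_mean = mean(row["ride_minutes"] for row in sample_rows)
longest_ride = max(row["ride_minutes"] for row in sample_rows)
per_type = mean_by_user_type(sample_rows)
print(overall_mean, longest_ride, per_type)
```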

I also explored the mean ride length and the number of rides by day_of_week and user type.
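Grouping on two columns at once gives both figures in a single pass. This is a sketch under assumptions: ride_minutes is a hypothetical column name and the rows are made-up sample data.

```python
from collections import defaultdict

# Hypothetical rides with day_of_week and a ride length in minutes.
sample_rows = [
    {"member_casual": "member", "day_of_week": "Monday", "ride_minutes": 10.0},
    {"member_casual": "member", "day_of_week": "Monday", "ride_minutes": 14.0},
    {"member_casual": "casual", "day_of_week": "Saturday", "ride_minutes": 30.0},
    {"member_casual": "casual", "day_of_week": "Saturday", "ride_minutes": 20.0},
    {"member_casual": "casual", "day_of_week": "Saturday", "ride_minutes": 40.0},
]


def summarize_by_day(rows):
    """Return ride count and mean minutes per (user type, day of week)."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["member_casual"], row["day_of_week"])
        groups[key].append(row["ride_minutes"])
    return {
        key: {"rides": len(values), "mean_minutes": sum(values) / len(values)}
        for key, values in groups.items()
    }


summary = summarize_by_day(sample_rows)
for key, stats in summary.items():
    print(key, stats)
```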

I also wanted to see how the time of year affected rides.
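A seasonal view can be built by counting rides per calendar month and user type. The sketch below assumes started_at is a string beginning with the year and month as YYYY-MM, and the sample rows are hypothetical.

```python
from collections import Counter

# Hypothetical rides; only the month and user type matter here.
sample_rows = [
    {"started_at": "2021-07-04 12:00:00", "member_casual": "casual"},
    {"started_at": "2021-07-10 09:00:00", "member_casual": "casual"},
    {"started_at": "2021-07-12 08:00:00", "member_casual": "member"},
    {"started_at": "2022-01-15 08:00:00", "member_casual": "member"},
]


def rides_by_month(rows):
    """Count rides per (YYYY-MM, user type)."""
    return Counter((row["started_at"][:7], row["member_casual"]) for row in rows)


monthly = rides_by_month(sample_rows)
for (month, user_type), rides in sorted(monthly.items()):
    print(month, user_type, rides)
```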

From here I explored the total trips by user type and bike preference by user type.
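Bike preference is easiest to compare as a share of each group's own rides, since the two groups differ in total trips. A minimal sketch, using hypothetical sample rows whose resulting percentages are illustrative only:

```python
from collections import Counter

# Hypothetical rides after docked_bike has been relabeled as classic_bike.
sample_rows = [
    {"member_casual": "member", "rideable_type": "classic_bike"},
    {"member_casual": "member", "rideable_type": "classic_bike"},
    {"member_casual": "member", "rideable_type": "classic_bike"},
    {"member_casual": "member", "rideable_type": "electric_bike"},
    {"member_casual": "casual", "rideable_type": "classic_bike"},
    {"member_casual": "casual", "rideable_type": "electric_bike"},
]


def bike_share_by_user_type(rows):
    """Return the percentage of each user type's rides taken on each bike type."""
    totals = Counter(row["member_casual"] for row in rows)
    pairs = Counter((row["member_casual"], row["rideable_type"]) for row in rows)
    return {pair: 100 * rides / totals[pair[0]] for pair, rides in pairs.items()}


totals = Counter(row["member_casual"] for row in sample_rows)
shares = bike_share_by_user_type(sample_rows)
print(dict(totals))
print(shares)
```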

Finally, I checked the top 5 start and end stations by user type.
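A station ranking of this kind can be sketched as below. Skipping rides with a missing station name is an assumption, and the station names are placeholders rather than real stations.

```python
from collections import Counter

# Hypothetical rides; station names are placeholders.
sample_rows = [
    {"member_casual": "casual", "start_station_name": "Station A"},
    {"member_casual": "casual", "start_station_name": "Station A"},
    {"member_casual": "casual", "start_station_name": "Station B"},
    {"member_casual": "casual", "start_station_name": None},
    {"member_casual": "member", "start_station_name": "Station C"},
]


def top_stations(rows, user_type, column="start_station_name", n=5):
    """Return the n most used stations for one user type as (name, rides)."""
    names = [
        row[column]
        for row in rows
        if row["member_casual"] == user_type and row.get(column)
    ]
    return Counter(names).most_common(n)


print(top_stations(sample_rows, "casual"))
print(top_stations(sample_rows, "member"))
```

Passing column="end_station_name" gives the matching ranking for end stations.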

Summary of Analysis

Looking through the data, there are noticeable differences in how members and casual users use Cyclistic bikes. The charts below show the top 5 start and end stations by user type.

The most popular stations for casual users were concentrated near tourist attractions such as the beach, the children’s museum, Millennium Park, the theater, and the aquarium. The most popular member stations were farther from the coast and nearer to offices, which suggests they are used for commuting to work.

Looking at usage by day of the week and user type, we can see that the work week is the most popular time for members, while their weekend rides are longer but fewer. In contrast, casual users took the fewest rides during the work week and took their longest and most numerous rides on the weekend. Casual users average double the ride length of members: 26 minutes per ride versus 13 minutes.

Weather appears to play a role in bike usage, with rides dropping in the winter months and peaking in the summer. Members take more trips than casual riders except in June, July, and August.

Classic bikes are favored by both members and casual users, making up 60% of rides in each group.

Top Recommendations

My top three recommendations to maximize annual memberships:

-- Since casual riders’ trips are longer than members’ trips, charge casual riders a per-minute usage fee to make the membership more appealing and encourage them to sign up.

-- Create a tiered membership program that offers different membership levels for visitors and locals.

-- Advertise to casual users at the most popular start/end locations and offer incentives for signing up for a membership.
