Continuing from our previous conversation, I had an idea that may or may not help us improve the model's R² score.
The idea: why not introduce a third independent variable?
To find a candidate, I turned to Google and began researching the leading causes of diabetes.
After a long and intense search, I came across a very interesting relationship between diabetes and the age of the patient.
What I found is that diabetes and age appear to have a linear relationship: the number of patients increases as age increases.
From this, I got the idea of introducing “age factor” as the third independent variable.
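A rough sketch of how the comparison could be tested is below. It is only an illustration under assumptions: the data is randomly generated (not the real diabetes data), the model is plain least-squares linear regression, and the names `x1`, `x2`, `age`, and `fit_and_score` are placeholders I made up for the two existing variables, the new age factor, and a helper function.

```python
import numpy as np

def r2_score(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_and_score(X, y):
    """Fit ordinary least squares with an intercept and return the in-sample R²."""
    A = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares solution
    return r2_score(y, A @ coef)

# Synthetic placeholder data (assumption): NOT the real dataset.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)                # stand-in for existing variable 1
x2 = rng.normal(size=n)                # stand-in for existing variable 2
age = rng.uniform(20, 80, size=n)      # stand-in for the proposed age factor
y = 1.5 * x1 - 0.8 * x2 + 0.05 * age + rng.normal(scale=0.5, size=n)

r2_two = fit_and_score(np.column_stack([x1, x2]), y)
r2_three = fit_and_score(np.column_stack([x1, x2, age]), y)

print(f"R² with two variables:   {r2_two:.3f}")
print(f"R² with three variables: {r2_three:.3f}")
```

One caveat: for least squares scored on the same data it was fitted on, adding a variable can never lower R², so a higher score alone does not prove the age factor helps. Scoring on held-out data would be a fairer test.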
In addition, I found some data on the CDC website, but it is broken down by state rather than by county, so I’ll be modifying the current data to match.
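One possible way to line the two levels up, shown only as a sketch: give every county the value of its state. This is an assumption about how the modification could be done, not a decided approach. The column names (`county`, `state`, `age_factor`) and all values are made-up placeholders, not real CDC figures.

```python
import pandas as pd

# Made-up placeholder rows (assumption): not real county or CDC data.
counties = pd.DataFrame({
    "county": ["County A", "County B", "County C", "County D"],
    "state": ["State X", "State X", "State Y", "State Y"],
})
state_age = pd.DataFrame({
    "state": ["State X", "State Y"],
    "age_factor": [38.5, 41.2],   # hypothetical state-level values
})

# Left merge keeps every county row and copies its state's value onto it.
# validate="many_to_one" raises an error if a state appears twice in state_age.
merged = counties.merge(state_age, on="state", how="left", validate="many_to_one")
print(merged)
```

With this approach, all counties in the same state share one age value, so the new variable only captures differences between states.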
However, I have some doubts that I want to resolve with the professor, so I will get those cleared up first and then test my theory.