mhchen:

I have a massive data set from the 1994 U.S. census. It lists each surveyed person's age, education, marital-status, job, race, gender, capital-gain/loss, hours-per-week, native-country, and whether they make less than or more than 50k a year. What questions do you guys want me to answer with it? (The data set is from https://archive.ics.uci.edu/ml/datasets/adult)

mhchen:

Here's an example question, with an example graph I made myself: "Who is smarter, males or females?" [graph]

mhchen:

[graph] Every first-grade dropout earned less than 50k :( sad

mhchen:

[graph] Of course there's that one dude who wins the jackpot on investments. I'm going to figure out who that point represents.

mhchen:

Of course LOL

mhchen:

Is this even possible? 99 hours a week, WHAT [graph]

justjm:

Did you make all the graphics yourself, or did you find them? I just want to know because it seems interesting. Also wow, pretty cool data.

mhchen:

Yeah I'm using RStudio to make the graphics

justjm:

thank you

justjm:

Does it also find the data for you, or do you need to find it yourself?

mhchen:

I downloaded the data to my computer. Then I use RStudio to rearrange the data however I want (like only viewing rows that are male and less than 50 years of age), and with my new data I can draw plots. The code looks like this:

```r
library(dplyr)    # %>%, group_by(), summarize()
library(ggplot2)  # ggplot(), geom_line(), labs(), theme()

# `data` is the census table already loaded into R
data %>%
  group_by(education_num, gender) %>%        # one group per education level and gender
  summarize(n = sum(population)) %>%         # total count within each group
  ggplot(aes(education_num, n)) +
  geom_line(aes(color = gender)) +           # one line per gender
  theme_bw() +
  labs(title = "Count of Education Levels by Gender", caption = "Figure 1.2",
       x = "education level", y = "count") +
  theme(axis.text.x = element_text(angle = 50, hjust = 1))  # tilt x-axis labels
```

and the graph for that would look like this: [graph]
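The row filtering mentioned above (only rows that are male and under 50) could be sketched with dplyr roughly as follows. The tiny data frame and its column names (`age`, `gender`, `income`) are made up for illustration and are not taken from the actual file.

```r
library(dplyr)

# Toy stand-in for the census table; columns and values are illustrative assumptions.
census <- data.frame(
  age    = c(25, 61, 38, 47, 52),
  gender = c("Male", "Male", "Female", "Male", "Female"),
  income = c("<=50K", ">50K", "<=50K", ">50K", "<=50K"),
  stringsAsFactors = FALSE
)

# Keep only rows that are male and under 50 years of age.
males_under_50 <- census %>%
  filter(gender == "Male", age < 50)

print(males_under_50)
```

The filtered table can then be piped into `ggplot()` the same way as the full one.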

justjm:

Ohh okay, thank you. Is it done in a terminal or on its own platform?

mhchen:

On its own platform.

justjm:

Ah okay.

Nnesha:

What does the 2nd graph represent? How would you interpret it? Perhaps a scatter plot is not the best graph to represent this data?

Nnesha:

Did you label the horizontal axis, or did the software?

mhchen:

That was just me casually figuring out if any preschool-only educated people earned more than 50k. A scatter plot is the quickest to make, so I just made it. The software automatically labels the axes, but I can change the axis names.
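As a rough sketch of an alternative to the scatter plot, the same question could be checked with a count table and a grouped bar chart, with the axis names set by hand through `labs()`. The toy rows and the labels `"Preschool"`, `"<=50K"`, and `">50K"` are assumptions for illustration, not values confirmed from the real file.

```r
library(dplyr)
library(ggplot2)

# Toy data; education and income labels are illustrative assumptions.
census <- data.frame(
  education = c("Preschool", "Preschool", "Bachelors", "Bachelors", "HS-grad"),
  income    = c("<=50K", "<=50K", ">50K", "<=50K", ">50K"),
  stringsAsFactors = FALSE
)

# Count people in each education/income combination.
counts <- census %>% count(education, income)

# Number of preschool-only rows above 50k in this toy data.
preschool_high <- census %>%
  filter(education == "Preschool", income == ">50K") %>%
  nrow()

# Grouped bars instead of a scatter plot; labs() overrides the automatic axis names.
p <- ggplot(counts, aes(education, n, fill = income)) +
  geom_col(position = "dodge") +
  labs(x = "education level", y = "count")
```

Printing `p` in RStudio draws the chart; `preschool_high` answers the yes/no question directly without reading it off a plot.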

Nnesha:

okay. There are some things that can be done to organize the graphs.

Can't find your answer? Make a FREE account and ask your own questions, OR help others and earn volunteer hours!

Join our real-time social learning platform and learn together with your friends!