Hi all, I am very new to R and am not familiar with R scripting.
May I know whether it is possible to plot a graph by the first letter of a name?
For example:

Name:        Age:
Angel        20
Amelia       20
Bernard      19
Stephanie    20
Vanessa      22
Angeline     23
Camel        21

If I want to plot the names starting with the letter 'A' together with their ages, how can I do it?

I am assuming that you have more than 200 names; otherwise your analysis would be weak. Better still, if you have 5000 names broken down by country, you could provide great visual insights.

I do have more than 200 names. The data above is just an example. I have more than 1000 names in my database.

Do you have any idea how to plot the graph based on the first letter of the name?

Here is one example of how to do it.

This assumes "testdata.csv" contains the data in CSV format, with columns Name and Age.


dat <- read.csv("testdata.csv")
dat$Name.first <- toupper(substr(dat$Name,1,1)) # extract first letter into $Name.first
dat$Name.first <- factor(dat$Name.first, levels = LETTERS, ordered = TRUE)  # convert the letters to factor

plot(dat$Name.first, dat$Age)  # plot :)
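To plot only the names starting with 'A', as asked in the first post, one option is to filter on the first letter before plotting. This is only a minimal sketch: it types in the seven sample rows by hand instead of reading "testdata.csv", and the name dat.A and the choice of dotchart() are my own assumptions, not part of the answer above.

```r
# Sample data from the first post, typed inline so the sketch runs on its own
dat <- data.frame(
  Name = c("Angel", "Amelia", "Bernard", "Stephanie",
           "Vanessa", "Angeline", "Camel"),
  Age  = c(20, 20, 19, 20, 22, 23, 21),
  stringsAsFactors = FALSE
)

# First letter of each name, upper-cased
dat$Name.first <- toupper(substr(dat$Name, 1, 1))

# Keep only the rows whose name starts with "A" (dat.A is a hypothetical name)
dat.A <- subset(dat, Name.first == "A")

# One dot per person: names on the y axis, age on the x axis
dotchart(dat.A$Age, labels = dat.A$Name, xlab = "Age")
```

Changing "A" to any other letter selects a different group.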

Hi, but the y axis of the graph doesn't show the name of each person?

Here are a few examples of plotting with names. However, I believe it is impractical to show more than 1000 names on one graph.


## [1] - plot name text.

plot(dat$Name.first, dat$Age)
text(dat$Name.first, jitter(dat$Age), dat$Name, cex=0.8,col="blue")


library(lattice)  # library 'lattice' is required for xyplot()

## [2] plot by name on Y axis.
xyplot(Name ~ Age , dat )


## [3] plot by name based on first letter.
xyplot(Name ~ Age | Name.first , dat )


Thanks a lot.

What if my data happens to contain two 'Angel' entries?

How do I plot only the Angel rows and their ages?

The two 'Angel' labels you mentioned are probably 'Angel' and 'Amelia': both are 20 years old, so they are drawn at the same point. The jitter() function is used to avoid this overlap, but it seems it did not separate the labels when you ran the script.

It is easy to extract only the records that satisfy a condition, e.g. the name is "Angel".

dat.subset <- subset(dat, Name == "Angel")

You can then plot using the new data frame.
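As a rough sketch of that last step, assuming the data really does contain two rows named "Angel": the second Angel (age 25) is made up for illustration, and barplot() is just one possible choice of plot.

```r
# Small inline data set; the second "Angel" row is hypothetical
dat <- data.frame(
  Name = c("Angel", "Amelia", "Angel", "Bernard"),
  Age  = c(20, 20, 25, 19),
  stringsAsFactors = FALSE
)

# Keep only the rows whose name is exactly "Angel"
dat.subset <- subset(dat, Name == "Angel")

# One bar per matching row, labelled with the name
barplot(dat.subset$Age, names.arg = dat.subset$Name, ylab = "Age")
```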

Hi Gen! 

Another scenario: what if I want to plot only the values of a variable that are less than 200?

I used plot(data.table$money > 200), but I felt that it wasn't the graph I wanted.

Any help?
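One possible reason the graph looks wrong: a comparison such as money > 200 returns a vector of TRUE/FALSE values, so plot() draws those instead of the amounts. A minimal sketch of filtering first and plotting afterwards follows. The data frame df and its money values are made up for illustration, and the condition uses < 200 to match the question rather than the > 200 in the attempt.

```r
# Hypothetical data; the 'money' values are invented for this sketch
df <- data.frame(money = c(50, 120, 250, 180, 400, 90))

# The comparison gives a logical vector, one TRUE/FALSE per row
is.small <- df$money < 200

# Use the logical vector to keep only the matching rows
small <- df[is.small, , drop = FALSE]

# Plot the remaining values against their position in the filtered data
plot(small$money, ylab = "money")
```

subset(df, money < 200) would give the same rows as the indexing line.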


