Posted on 2022-9-14 09:53:10
1. Scatter plot
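- # Read the table; column 2 is plotted as genome size, column 4 as gene count
- m <- read.table("prok_representative.csv", sep = ",", header = TRUE)
- x <- m[, 2]
- y <- m[, 4]
- plot(x, y, pch = 16, xlab = "Genome Size", ylab = "Genes")
- # Fit a linear regression and draw the fitted line
- fit <- lm(y ~ x)
- abline(fit, col = "blue", lwd = 1.8)
- # Note: this is the adjusted R-squared, shown as R^2 in the label
- rr <- round(summary(fit)$adj.r.squared, 2)
- intercept <- round(summary(fit)$coefficients[1], 2)
- slope <- round(summary(fit)$coefficients[2], 2)
- eq <- bquote(atop("y = " * .(slope) * " x + " * .(intercept), R^2 == .(rr)))
- # The label position (12, 6e3) is in data coordinates; adjust it for your data
- text(12, 6e3, eq)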
Scatter plot of the correlation between genome size and gene count
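For anyone who wants to try this without the CSV file, here is a minimal sketch of the same fit-and-label steps on simulated data. The variable names and the simulated values (genome_size, genes, a slope near 900) are made up for illustration and are not taken from prok_representative.csv. It also places the label relative to the plot region instead of at fixed data coordinates, which is just one possible choice.

```r
# Simulated stand-in for the genome table (hypothetical values, not real data)
set.seed(1)
genome_size <- runif(200, min = 0.5, max = 12)
genes <- 900 * genome_size + rnorm(200, mean = 0, sd = 300)

# Linear fit, then pull out the rounded coefficients and adjusted R-squared
fit <- lm(genes ~ genome_size)
intercept <- round(unname(coef(fit)[1]), 2)
slope <- round(unname(coef(fit)[2]), 2)
rr <- round(summary(fit)$adj.r.squared, 2)

plot(genome_size, genes, pch = 16, xlab = "Genome Size", ylab = "Genes")
abline(fit, col = "blue", lwd = 1.8)

# Build the two-line label with plotmath
eq <- bquote(atop("y = " * .(slope) * " x + " * .(intercept), R^2 == .(rr)))

# Place the label near the top-left corner using the plot region limits,
# so it stays visible whatever the data range is
usr <- par("usr")  # c(xmin, xmax, ymin, ymax) of the plot region
text(usr[1] + 0.2 * (usr[2] - usr[1]),
     usr[4] - 0.15 * (usr[4] - usr[3]),
     eq)
```

With real data, replace the two simulated vectors with the columns read from the file; the rest should work unchanged.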
2. Histogram of gene length distribution
- # Gene length distribution
- x <- read.table("H37Rv.gff", sep = "\t", header = FALSE, skip = 7, quote = "")
- x <- x[x$V3 == "gene", ]  # keep only the gene features
- # x <- x %>% dplyr::filter(V3 == 'gene')
- x <- abs(x$V5 - x$V4) + 1  # length = end - start + 1
- # x <- x %>% dplyr::mutate(gene_len = abs(V5 - V4) + 1)
- # head(x$gene_len)
- length(x)
- range(x)
- hist(x)
- hist(x, breaks = 80)
- ?hist
- hist(x, breaks = "Sturges")
- hist(x, breaks = c(0, 500, 1000, 1500, 2000, 2500, 15000))  # breaks must span range(x)
- hist(x, breaks = 80, freq = FALSE)  # probability densities instead of counts
- # density= sets the shading lines per inch; it does not switch to a density scale
- hist(x, breaks = 80, density = 20)
- hist(rivers, density = 20, breaks = 10)  # rivers is a built-in dataset
- pdf(file = "hist.pdf")
- h <- hist(x, breaks = 80, col = "pink", xlab = "Gene Length (bp)", main = "Histogram of Gene Length")
- rug(x)
- xfit <- seq(min(x), max(x), length.out = 100)
- yfit <- dnorm(xfit, mean = mean(x), sd = sd(x))
- # Rescale the normal density to the count axis: bin width * number of genes
- yfit <- yfit * diff(h$mids[1:2]) * length(x)
- lines(xfit, yfit, col = "blue", lwd = 2)
- dev.off()
Histogram of gene length distribution
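The step that multiplies yfit by the bin width and the number of genes can look arbitrary, so here is a small sketch of why it works, using simulated lengths (gene_len is a made-up log-normal sample, not the H37Rv data). On the count scale the bars have a total area of bin width times n, while on the density scale (freq = FALSE) the total area is 1, so the normal curve can be drawn without any rescaling.

```r
# Hypothetical gene lengths, simulated only for illustration
set.seed(42)
gene_len <- round(rlnorm(4000, meanlog = 6.7, sdlog = 0.6))

# Compute the histogram without drawing it
h <- hist(gene_len, breaks = 80, plot = FALSE)
bin_width <- diff(h$breaks[1:2])  # bins are equal-width when breaks is a number

# Total bar area on each scale
area_density <- sum(h$density * diff(h$breaks))  # density scale
area_counts <- sum(h$counts * diff(h$breaks))    # count scale

# On the density scale the normal curve needs no rescaling
hist(gene_len, breaks = 80, freq = FALSE, col = "pink",
     xlab = "Gene Length (bp)", main = "Histogram of Gene Length")
rug(gene_len)
curve(dnorm(x, mean = mean(gene_len), sd = sd(gene_len)),
      add = TRUE, col = "blue", lwd = 2)
```

Either approach gives the same picture up to the y-axis units: keep counts and scale the curve, or switch to freq = FALSE and leave the curve alone.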