A blog by a Bioinformatician for people working in Life Sciences Research and NGS. I share my day to day events, thoughts and useful code snippets.
Note
Dear Sukhdeep, I have BED formatted files for the PolyA seq data from the UCSC genome browser. I want to have the BIGWIG format for the same. However, while doing google search I came across your blog but unfortunately your post which mentions the R script is no longer available. Can you please help.
Hi, I just noticed this. Search the forum at biostars.org and check https://www.biostars.org/p/42844/. If you can't find an answer, post a new question there. Good luck!
Quote
A thing long expected takes the form of the unexpected when at last it comes. Mark Twain
Photo
You Wish Your Neurons Were This Pretty
When Greg Dunn finished his Ph.D. in neuroscience at Penn in 2011, he bought himself a sensory deprivation tank as a graduation present. The gift marked a major life transition, from the world of science to a life of meditation and art.
Now a full-time artist living in Philadelphia, Dunn says he was inspired in his grad-student days by the spare beauty of neurons treated with certain stains. The Golgi stain, for example, will turn one or two neurons black against a golden background. “It has this Zen quality to it that really appealed to me,” Dunn said.
Text
How to convert text files to all upper or lower case [reblogged]
As usual in Linux, there is more than one way to accomplish a task.
To convert a file (input.txt) to all lower case (output.txt), choose any ONE of the following:
dd
$ dd if=input.txt of=output.txt conv=lcase
tr
$ tr '[:upper:]' '[:lower:]' < input.txt > output.txt
awk
$ awk '{ print tolower($0) }' input.txt > output.txt
perl
$ perl -pe '$_= lc($_)' input.txt > output.txt
sed
$ sed -e 's/\(.*\)/\L\1/' input.txt > output.txt
In the sed command, the backreference \1 refers to the entire line and \L converts it to lower case (\L and \U are GNU sed extensions).
To convert a file (input.txt) to all upper case (output.txt):
dd
$ dd if=input.txt of=output.txt conv=ucase
tr
$ tr '[:lower:]' '[:upper:]' < input.txt > output.txt
awk
$ awk '{ print toupper($0) }' input.txt > output.txt
perl
$ perl -pe '$_= uc($_)' input.txt > output.txt
sed
$ sed -e 's/\(.*\)/\U\1/' input.txt > output.txt
These one-liners can be used, for example, to convert the lowercase characters in a FASTA file to uppercase and vice versa. Note that they change the case of the header lines as well. Cheers
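If you want to change the case of the sequence only and leave the FASTA header lines (the ones starting with >) as they are, a few lines of Python will do. This is just a minimal sketch: the function name convert_fasta_case and the example record are made up for illustration, and it assumes the file is small enough to handle as a list of lines.

```python
def convert_fasta_case(lines, to_upper=True):
    """Change the case of sequence lines; '>' header lines are kept unchanged."""
    converted = []
    for line in lines:
        if line.startswith(">"):
            # Header line: keep the record name exactly as it is.
            converted.append(line)
        elif to_upper:
            converted.append(line.upper())
        else:
            converted.append(line.lower())
    return converted


# Hypothetical two-line record, only for illustration.
example = [">chr1 test record", "acgtNNacgt"]
for out_line in convert_fasta_case(example):
    print(out_line)
```

For a real file you would read the lines with open("input.fa") and write the result to a new file; passing to_upper=False gives the lowercase version.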
#linux#convert#convertfiles#uppercase#lowercase#perl#sed#awk#dd#oneliner#programming#coding#tricks#trick#tips#tip#ubuntu#textfile#text#bioinformatics#fasta#change#convertfasta#computational biology
Text
Overlay Multiple Tracks in UCSC Browser
Guys, here is a quick tutorial on how to overlay multiple tracks in the UCSC Genome Browser.
If you already know how to visualize your NGS data with UCSC, carry on with this tutorial; otherwise read this post first: http://biofeed.tumblr.com/post/45142855966/visualizing-chip-seq-data-using-ucsc
You will need a publicly accessible web server, bigwig files and a text editor.
Fire up the text editor and make a file called hub.txt.
We will make a track hub. From the official UCSC page:
Track hubs are web-accessible directories of genomic data that can be viewed on the UCSC Genome Browser alongside native annotation tracks. Hubs are a useful tool for visualizing a large number of genome-wide data sets. For example, a project that has produced several wiggle plots of data can use the hub utility to organize the tracks into composite and super-tracks, making it possible to show the data for a large collection of tissues and experimental conditions in a visually elegant way, similar to how the ENCODE native data tracks are displayed in the browser.
This will be your control file for navigating the tracks as hubs.
My hub.txt contains the following lines:
https://gist.github.com/vanbug/5187385
where hub is followed by the parent directory name (the directory that will contain all the tracks and this text file, hub.txt), and genomesFile names the file that lists the genomes your tracks are to be viewed on in UCSC. They should be standard UCSC assembly names like mm9, mm10, hg18 etc. The rest of the fields are self-explanatory and required. Now, create another text file called genomes.txt, which should have
genome mm10
trackDb mm10/trackK4Db3.txt
where genome is the genome to which I mapped my files
and trackDb is the trackDb file which contains the information about your tracks
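If you set up hubs often, the two control files can be generated with a few lines of Python. This is only a sketch and not a copy of my gist: the function names and the example labels and email are made up, and I am assuming the usual hub.txt fields (hub, shortLabel, longLabel, genomesFile, email).

```python
def hub_text(hub_name, short_label, long_label, email, genomes_file="genomes.txt"):
    """Return the content of hub.txt as one string."""
    lines = [
        f"hub {hub_name}",
        f"shortLabel {short_label}",
        f"longLabel {long_label}",
        f"genomesFile {genomes_file}",
        f"email {email}",
    ]
    return "\n".join(lines) + "\n"


def genomes_text(genome_to_trackdb):
    """Return the content of genomes.txt: one stanza per genome, blank line between."""
    stanzas = [
        f"genome {genome}\ntrackDb {trackdb}\n"
        for genome, trackdb in genome_to_trackdb.items()
    ]
    return "\n".join(stanzas)


# Hypothetical values, only for illustration.
print(hub_text("myHub", "K4 overlay", "H3K4me3 overlay tracks", "me@example.org"))
print(genomes_text({"mm10": "mm10/trackK4Db3.txt"}))
```

Write the two strings to hub.txt and genomes.txt in the parent directory on your web server.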
Now, make a directory inside your parent directory. Let's call it mm10.
Inside this directory, make another text file called trackK4Db3.txt, as declared earlier.
My trackDb file has these entries
https://gist.github.com/vanbug/5187466
The first paragraph is the parent container; let's call it major. It holds all the information about the display of the child tracks. You can set options like viewLimits, windowingFunction etc. type is a really important parameter, as it tells UCSC the type of your files (bigwig in this case).
Then follows the information about each track contained under this parent container.
Take the first one, H3K4me3: bigDataUrl holds the location of the file, which can be an http/ftp link or a file path relative to the trackDb file.
The parent parameter names the container the track belongs to.
Other important parameters in the parent container are
aggregate transparentOverlay
showSubtrackColorOnUi on
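To give an idea of how such a trackDb file is laid out, here is a small Python sketch that writes one parent container stanza followed by one stanza per child track. It is not my gist: the function name overlay_trackdb, the default viewLimits, the file names and the colours are assumptions for illustration, and the container is declared with container multiWig, which is the setting UCSC uses for overlay containers.

```python
def overlay_trackdb(container, tracks, view_limits="0:50"):
    """Build trackDb text for a multiWig overlay.

    tracks is a list of (name, bigDataUrl, "R,G,B") tuples.
    """
    parent = [
        f"track {container}",
        "container multiWig",
        f"shortLabel {container}",
        f"longLabel {container} overlay",
        "type bigWig",
        f"viewLimits {view_limits}",
        "aggregate transparentOverlay",
        "showSubtrackColorOnUi on",
        "windowingFunction maximum",
        "visibility full",
    ]
    stanzas = ["\n".join(parent)]
    for name, url, color in tracks:
        child = [
            f"track {name}",
            f"parent {container}",
            f"bigDataUrl {url}",
            f"shortLabel {name}",
            f"longLabel {name}",
            "type bigWig",
            f"color {color}",
        ]
        stanzas.append("\n".join(child))
    # Stanzas are separated by a blank line.
    return "\n\n".join(stanzas) + "\n"


# Hypothetical tracks, only for illustration.
print(overlay_trackdb("major", [
    ("H3K4me3", "H3K4me3.bw", "255,0,0"),
    ("Input", "Input.bw", "0,0,255"),
]))
```

Giving each child its own colour is what makes the transparent overlay readable, so it is worth choosing colours that contrast well.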
Now, with everything set, we pass the URL of the hub file to UCSC.
As we are talking about mm10 here, append the URL of the hub file to this URL: http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm10&hubUrl=
So, the final url is http://genome.ucsc.edu/cgi-bin/hgTracks?db=mm10&hubUrl=https://sukhdeeps_webdav/hub.txt
This will produce the following pretty image
Small trick: adding cpgIslandExt=pack to the URL will switch on the CpG islands track.
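If you build these links for several hubs or genomes, you can assemble them in Python instead of by hand. A minimal sketch, assuming the same hgTracks URL as above; the function name hub_view_url and the extra argument are made up for illustration.

```python
from urllib.parse import urlencode

UCSC_HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def hub_view_url(db, hub_url, extra=None):
    """Build the UCSC link for a hub; extra holds further track settings."""
    params = {"db": db, "hubUrl": hub_url}
    params.update(extra or {})
    # Keep ':' and '/' readable so the hub URL looks like the hand-written one.
    return UCSC_HGTRACKS + "?" + urlencode(params, safe=":/")


print(hub_view_url("mm10", "https://sukhdeeps_webdav/hub.txt",
                   extra={"cpgIslandExt": "pack"}))
```

The extra dictionary is where a setting such as cpgIslandExt=pack goes, so the link opens with the CpG islands already switched on.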
Cheers
#bigwig#ucsc#genomebrowser#genome#computational epigenetics#epigenetics#overlaytracks#overlay#multiplebigwig#multiWig#visualization#cpg#mm10#mm9#hg19#hg18#chip-seq#rna-seq#chipseq#rnaseq
Video
youtube
Genetics and Intelligence
#genetics#artificial intelligence#intelligence#youtube#googletalks#science#bioinformatics#Genetics and Intelligence#video#computational biology
Photo
PLOS Computational Biology: Translational Bioinformatics
'Translational Bioinformatics' is a collection of PLOS Computational Biology Education articles which reads as a "book" to be used as a reference or tutorial for a graduate level introductory course on the science of translational bioinformatics.
Translational bioinformatics is an emerging field that addresses the current challenges of integrating increasingly voluminous amounts of molecular and clinical data. Its aim is to provide a better understanding of the molecular basis of disease, which in turn will inform clinical practice and ultimately improve human health.
The concept of a translational bioinformatics introductory book was originally conceived in 2009 by Jake Chen and Maricel Kann. Each chapter was crafted by leading experts who provide a solid introduction to the topics covered, complete with training exercises and answers. The rapid evolution of this field is expected to lead to updates and new chapters that will be incorporated into this collection.
Collection editors: Maricel Kann, Guest Editor, and Fran Lewitter, PLOS Computational Biology Education Editor.
Download the full Translational Bioinformatics collection here: PDF | EPUB | MOBI
Read PLOS Computational Biology Founding Editor-in-Chief Phil Bourne's blog post: 'Let's make those book chapters open too!'
Collection URL: www.ploscollections.org/translationalbioinformatics
#bioinformatics#ngs#translational bioinformatics#chip-seq#rnaseq#computational biology#biology#plosone#opensource#journal
Text
Visualizing Chip-Seq data using UCSC [Bigwig]
Hola!
I would like to write a quick tutorial on viewing your coverage/track/wiggle files using UCSC. I am assuming you know command line tools, Linux operations and have knowledge about analysing Chip-Seq data.
So, once you have called peaks or have coverage files (bedGraph) from Chip-Seq or RNA-Seq data, the next step is to visualize the data in UCSC or IGV. To proceed, the coverage file should be converted to wig (wiggle plots) or, better, to bigwig format. You can also go to UCSC, add a custom track and upload your bedGraph/coverage file, which is in bed format but represents your peaks. You can upload as many files as you want and then view them (limit per file ~1000 MB). You can also make .wig files and upload them the same way. These are session tracks: if you reset your UCSC session they will be removed, and they can't be shared (unless you use the same computer).
We will talk about a better way: converting the coverage or wig files to bigwig files, which are kept on your own web server. UCSC then fetches from the file only the data for the range the user is viewing. This makes it faster, there is no need to upload files, tracks are easy to share and to keep track of, and big files (>1 GB) can be viewed.
First step : Installing
Grab the tool from the UCSC ftp according to your OS. The tool is called bedGraphToBigWig if you want to convert a coverage file to bigwig, or wigToBigWig if you start from a wig file. There are tools for back conversion as well, like bigWigToWig, if you need them later. Also, fetch the utility called fetchChromSizes to get the chromosome sizes of your organism of interest.
Second step : Conversion
Usage : bedGraphToBigWig file.bed mm9.chromSizes file.bw
Output is a bigwig file.
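One thing that often trips people up at this step: bedGraphToBigWig expects the input sorted by chromosome and then by start position. Below is a rough Python sketch that sorts bedGraph lines in memory and prepares the command from the usage line above. The function names are made up, it assumes the file fits in memory, and it drops track/browser header lines, which is my own choice.

```python
def sort_bedgraph_lines(lines):
    """Return bedGraph records sorted by chromosome name, then start position."""
    records = []
    for line in lines:
        # Skip blank lines and track/browser/comment header lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        records.append((chrom, int(start), int(end), value))
    # Plain string order on the chromosome name, numeric order on the start.
    records.sort(key=lambda rec: (rec[0], rec[1]))
    return ["\t".join(map(str, rec)) for rec in records]


def bedgraph_to_bigwig_command(bedgraph, chrom_sizes, bigwig):
    """Argument list for the conversion, in the order of the usage line."""
    return ["bedGraphToBigWig", bedgraph, chrom_sizes, bigwig]


print(sort_bedgraph_lines(["chr2 5 10 1.5", "chr1 20 30 2", "chr1 0 10 3"]))
print(bedgraph_to_bigwig_command("file.sorted.bed", "mm9.chromSizes", "file.bw"))
```

The argument list can be handed to subprocess.run(..., check=True) once the sorted lines are written to a file and the tool is on your PATH.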
Third Step : Uploading
Now you have the bigwig file, so let's make it available to UCSC. For that, just copy it to your web server, where you can get a link to the file like https://projects/files/file.bw. Open your favourite text editor and add the track lines as:
https://gist.github.com/vanbug/5138694
You can add multiple track lines, one for each sample you have. If you want to hide a specific track, just comment it out using '#'.
Now, name the file as bigwigCaller.txt and call it from UCSC as https://genome.ucsc.edu/cgi-bin/hgTracks?db=mm9&position=chr6:122657583-122663796&hgct_customText=https://projects/files/bigwigCaller.txt
So, I specified the organism as mm9 and the position as chr6:122657583-122663796; you can change this at any time. This will be your default view. I have included other options as well, which can be turned on/off at any time.
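With many samples, writing the track lines by hand gets tedious, so here is a small Python sketch that generates them. It is not the content of my gist: the function name bigwig_track_line and the sample names are made up, and it only uses the common custom-track settings (type, name, description, bigDataUrl, visibility, color).

```python
def bigwig_track_line(name, big_data_url, description=None,
                      visibility="full", color=None, hidden=False):
    """Return one custom-track line for a bigwig file on a web server."""
    parts = [
        "track",
        "type=bigWig",
        f'name="{name}"',
        f'description="{description or name}"',
        f"bigDataUrl={big_data_url}",
        f"visibility={visibility}",
    ]
    if color:
        parts.append(f"color={color}")
    line = " ".join(parts)
    # A leading '#' comments the track out, which hides it.
    return "#" + line if hidden else line


# Hypothetical samples, only for illustration.
samples = [("H3K4me3", "https://projects/files/file.bw", "255,0,0")]
print("\n".join(bigwig_track_line(n, u, color=c) for n, u, c in samples))
```

Save the printed lines as bigwigCaller.txt on the web server and call it from UCSC as shown above.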
This link can be shared with anyone, and multiple users can view the track at the same time without destroying the original file. The file can be password protected as well; you will then have to supply the password in the URL, which might be a security threat.
You can use a URL shortener to shorten the URL. The users/viewers can right click on the track in UCSC and change the track properties (limiting bars, colour, smoothness etc.), so it is customizable as well. Of course, you can import the bigwig in IGV as well; I have tested it and it works.
For users who like more automation and users working in R, we can also upload the track directly from R using the package called rtracklayer
library(rtracklayer)
# chip.cov is the coverage object from your own analysis
chip.tmp <- tempfile()
export(chip.cov, chip.tmp, "bedGraph")
restored.chip.track <- import(chip.tmp, "bedGraph", genome = "mm9")
session <- browserSession("UCSC")
track(session, "target") <- restored.chip.track
browserView(session, range(restored.chip.track))
Check the package manual for more parameters and explanation. The above code was used in the EuTRACC 2010 pipeline.
I hope that helps.
Cheers
P.S. Ported in from my Biostars Post
#bigwig#ucsc#ngs#chip-seq#rna-seq#rnaseq#visualization#visualize#bedgrah#bed#bedfile#plot#genome#genomebrowser#biostars#sukhdeepsingh#vanbug#rtracklayer#R#chip#bedGraphToBigWig#chrom#chromsizes#mm9#mm10#mouse#human#bioinformatics
Photo

Genius is a rather complex mixture of emotions and progressions !!
Text
Download encrypted file in background using Wget
Fire up the terminal, replace the username and password with your login details, and you can download a heavy file or a list of files in the background.
https://gist.github.com/vanbug/5120515
where files is a text file containing the links of the files to be downloaded.
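If you start such downloads from a script, the wget call can be put together in Python. This is only a sketch of the idea and not a copy of the gist: the function name is made up, and I am assuming the standard wget long options --background, --continue, --user, --password and --input-file.

```python
def wget_background_command(user, password, url_list_file):
    """Argument list for a password-protected background download with wget."""
    return [
        "wget",
        "--background",                   # detach and keep running after start
        "--continue",                     # resume partially downloaded files
        f"--user={user}",
        f"--password={password}",
        f"--input-file={url_list_file}",  # text file with one link per line
    ]


# Placeholder login details, only for illustration.
print(" ".join(wget_background_command("username", "password", "files")))
```

You can run the list with subprocess.run(..., check=True). Keep in mind that a password given on the command line is visible to other users of the machine in the process list.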
Quote
Our scientific power has outrun our spiritual power. We have guided missiles and misguided men. Martin Luther King, Jr.
Text
Sending large files using Thunderbird, DropBox and UbuntuOne [Cloud Storage]
Guys,
What about sending large files (> 50/100 MB) via email? It's possible.
You can integrate Thunderbird with cloud storage, so that attachments are stored in the cloud and the 20-30 MB limit imposed by your mail server no longer applies.
How to do it:
Linux Users : Go to Edit -> Preferences -> Attachments -> Add
Mac Users : Find Preferences and rest is same!!
Now, you can add a Ubuntu One account there (be sure to register first at the Ubuntu One website, they give you 5GB free) or a DropBox account, which most people use.
For setting up Dropbox, go to Tools -> Add-ons and search for DropBox for FileLink and install it. After that's done, you can see these 2 accounts enabled in your Thunderbird. So, next time, whenever you want to send an attachment, at the bottom bar, it will ask you, if you want to send the file using FileLink or you can click the small down arrow next to the attach button showing you more options on how to attach the file. Select the appropriate account/storage suiting your needs.

Cheers
#thunderbird#mail#dropbox#ubuntu#ubuntuone#mac#largeattachments#bigfiles#mailattachments#thunderbirddropbox#cloudstorage#thunderbirdcloud#mailcloud#increasespace
Text
Redmine, Github, embeddable Gists and Syntax Highlighting : Updates
Redmine, an open source content and project management system, is amazing. I am quite disorganized in storing my results and code. So, I finally set up version control for my code with git/github and Redmine for project management. I also updated my blog code to include syntax highlighting plus gists. Let's check it with an example
function foo(){
  print("Hello Geeks")
}
foo()
Great, so it works. To go a step further, I enabled Gists for the blog, using a gist from here.
https://gist.github.com/soemarko/1395926
Super, so with everything set up, it's finally time to get involved with blog writing. Cheers
Text
Live Facebook Notifier on Desktop : Terminal+GeekLet+Conky
Hola!
I just submitted the facebook notifier geeklet, which gives you an instant notification on your desktop whenever someone does any activity on facebook related to you. The link is here. I won't go into Geeklets in detail; in short, it is a wizard to run scripts and display their output directly on the screen/desktop, and it is made for Mac. Ubuntu/Linux has an alternative named conky.
You just need the facefeed.py script; change the RSS feed to yours to receive updates instantly without opening the browser. You can change the size and frequency of notifications in the script. All the instructions are given in the link. Do give it a try. It looks like this
Have Fun
Sukhi
#facefeed#geeklets#conky#python#facebook#facelet#vanbug#sukhi#sukhdeepsingh#notification#facebooknotifier
Text
[LINUX] removing files with special characters using rm
Many of you might have run into problems removing files that have special characters in their names, for instance $1.txt, -bg etc.
# removing $1.txt
rm \$1.txt
# if it's the only .txt file
rm *.txt
# removing -bg is a little tricky, as - marks an option for rm.
# From the manual: -- signals the end of options and disables further option processing.
# Any arguments after the -- are treated as file names.
rm -- -bg
# or, equivalently, prefix the name with ./ : rm ./-bg
For other complex characters, you can quote the name or use a grep match. Consider a file named r.34$@
# command (works when exactly one file name ends in @)
rm -- "$(ls | grep '@$')"
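If the quoting gets too confusing, you can also remove such files from Python, where the file name is passed as a plain string and no shell is involved, so nothing needs escaping. A minimal sketch; the function names are made up for illustration.

```python
import os


def names_ending_with(directory, suffix):
    """List the entries in directory whose names end with suffix."""
    return sorted(n for n in os.listdir(directory) if n.endswith(suffix))


def remove_by_exact_name(directory, names):
    """Remove the named files from directory and return the names removed."""
    removed = []
    for name in names:
        path = os.path.join(directory, name)
        # Only regular files are touched; missing names are skipped.
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    return removed
```

For example, remove_by_exact_name(".", ["$1.txt", "-bg"]) deletes both files from the current directory, and names_ending_with(".", "@") shows which names a pattern would hit before you delete anything.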
I hope that helps.
Cheers
Text
Facebook chat codes for pictures
Hey!
After developing friend tagging using @, Facebook moved a step further and lets you insert iconic images using a numeric code in [[ ]] format. Try [[zuck]], [[cocacola]]
Readers of the website Reddit have taken the game a step further by creating new profiles with 'rage faces' as the profile picture and then listing the ID numbers. Here are some popular codes.
Troll face: [[171108522930776]]
Me Gusta: [[211782832186415]]
Cereal Guy: [[170815706323196]]
LOL Face: [[168456309878025]]
NO Guy: [[167359756658519]]
Forever Alone: [[177903015598419]]
Not Bad: [[NotBaad]]
Challenge accepted: [[100002727365206]]
Okay face: [[100002752520227]]
Poker face: [[129627277060203]]
Okay face: [[224812970902314]]
Socially awkward penguin: [[98438140742]]
Lamp: [[100001256102462]]
No: [[167359756658519]]
Feel like a sir: [[168040846586189]]
Forever alone Christmas: [[125038607580286]]
Problem? [[171108522930776]]
Going a step further still, people combined these codes to build one big picture as a whole, like the following examples of masks etc.
Copy the codes and paste them into your chat box; big images will appear based on the title.
!!! The codes are not working due to Tumblr html formatting, so better go to this link and grab the codes.
http://goo.gl/0yuV2
Have fun
Sukhdeep Singh