Introduction RAWdataR
Thomas Gredig
12/16/2019
RAWdataR.Rmd
Inspect Directory
We can find out about projects and users for a particular directory by inspecting it.
library(RAWdataR)
path.RAW = raw.getSamplePath()
raw.inspectFolder(path.RAW)
#> $hasSubfolders
#> [1] FALSE
#>
#> $projects
#> [1] "Dual,FePc,Optical"
#>
#> $users
#> [1] "AN,MM,TG"
#>
#> $samples
#> [1] "20160718,NM20151109CuPc100gl150-A2(UDS).txt,Tb33Al66"
#>
#> $instruments
#> [1] "mm160622si1,spectrophotometer,vsm"
#>
#> $validFiles
#> [1] 3
#>
#> $invalidFiles
#> [1] 4
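The summary above can be approximated in base R. This is a minimal sketch, not the package's implementation: it assumes that file names consist of underscore-separated fields in the order date, project, user, instrument, sample, which is inferred from the example file names in this vignette. The helper names `field` and `collapse.unique` are hypothetical.

```r
# Three file names taken from the outputs shown in this vignette
files <- c(
  "20141215_FePc_TG_vsm_Tb33Al66_20141215-Tb33Al66-ac1Oe-20Hz.txt",
  "20160607_Optical_AN_Spectrophotometer_NM20151109CuPc100gl150-A2(UDS).txt",
  "20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt"
)

# Split each name at the underscores (assumed field separator)
parts <- strsplit(files, "_", fixed = TRUE)

# Hypothetical helper: pick the i-th field of every file name
field <- function(i) vapply(parts, function(p) p[i], character(1))

# Hypothetical helper: sorted unique values joined by commas
collapse.unique <- function(x) paste(sort(unique(x)), collapse = ",")

folder.summary <- list(
  projects    = collapse.unique(field(2)),
  users       = collapse.unique(field(3)),
  instruments = collapse.unique(tolower(field(4)))
)
print(folder.summary)
```

With these three names the sketch yields the same projects, users, and instruments strings as the output above, including the suspicious `mm160622si1` entry in the instrument position.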
One of the instruments is off. Let us find out which file that is:
basename(raw.findFiles(path.RAW, instrument='MM160622SI1'))
#> [1] "20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt"
The data filename is missing an instrument. It appears to be a vsm data file. The instrument can be added as follows:
f = basename(raw.findFiles(path.RAW, instrument='MM160622SI1'))
f.new = basename(raw.fixInvalidFile(f, addInstrument='vsm'))
print(paste("Change:",f,"to:",f.new))
#> [1] "Change: 20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt to: 20160718_Dual_MM_vsm_MM160622SI1_20160718_2.5K_MVSH_00001.txt"
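The renaming shown above amounts to inserting an instrument token into the file name. A minimal base R sketch of that idea follows. It assumes the instrument belongs in the fourth underscore-separated field (after date, project, and user), as the corrected name suggests. The function `add.instrument` is hypothetical and is not how `raw.fixInvalidFile` is necessarily implemented.

```r
# Hypothetical helper: insert an instrument token after the third field
add.instrument <- function(filename, instrument) {
  fields <- strsplit(filename, "_", fixed = TRUE)[[1]]
  # append() places the new value after position 3 (date, project, user)
  paste(append(fields, instrument, after = 3), collapse = "_")
}

old.name <- "20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt"
new.name <- add.instrument(old.name, "vsm")
print(new.name)
```

The sketch only builds the new name. Renaming the file on disk would be a separate step, for example with `file.rename()`.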
Find duplicate RAW data files
path.RAW = raw.getSamplePath()
f = raw.findDuplicates(path.RAW,isValid=FALSE)
d1 = data.frame(f=basename(f),md5=raw.getMD5(f),row.names = c())
knitr::kable(d1, col.names = c('Duplicated Files','MD5'))
Duplicated Files | MD5 |
---|---|
20160607_Optical_AN_Spectrophotometer_NM20151109CuPc100gl150-A2(UDS).txt | 7545cd |
not-a-RAW-file.txt | 7545cd |
This is the list of duplicated files. We can find the original file for each duplicate by searching for all files with the same MD5 checksum:
searchMD5 = d1$md5[!duplicated(d1$md5)]
for(m in searchMD5) {
print(paste("Duplicates for files with MD5 = ",m))
f = raw.findFiles(path.RAW, md5=m)
print(basename(f))
}
#> [1] "Duplicates for files with MD5 = 7545cd"
#> [1] "20141215_FePc_TG_vsm_Tb33Al66_20141215-Tb33Al66-ac1Oe-20Hz.txt"
#> [2] "20160607_Optical_AN_Spectrophotometer_NM20151109CuPc100gl150-A2(UDS).txt"
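Checksum-based duplicate detection can also be sketched without the package, using `tools::md5sum()` from base R. The example below is illustrative only: it creates three small temporary files (hypothetical names `a.txt`, `b.txt`, `c.txt`), groups them by MD5 checksum, and keeps the groups with more than one member. It does not reproduce the shortened checksums shown in the table above.

```r
# Create a temporary folder with two identical files and one distinct file
demo.dir <- tempfile("rawdemo")
dir.create(demo.dir)
writeLines("same content", file.path(demo.dir, "a.txt"))
writeLines("same content", file.path(demo.dir, "b.txt"))
writeLines("different content", file.path(demo.dir, "c.txt"))

files <- list.files(demo.dir, full.names = TRUE)

# tools::md5sum() returns one checksum per file
md5 <- unname(tools::md5sum(files))

# Group file names by checksum; groups with more than one file are duplicates
groups <- split(basename(files), md5)
dups <- Filter(function(g) length(g) > 1, groups)
print(dups)

# Remove the temporary folder again
unlink(demo.dir, recursive = TRUE)
```

Grouping by checksum shows every member of a duplicate set at once, which is the same information the loop above collects one MD5 value at a time.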