Introduction to RAWdataR
Thomas Gredig
12/16/2019
RAWdataR.Rmd

Inspect Directory
Inspecting a directory lists the projects, users, samples, and
instruments found in its file names, together with the number of
valid and invalid files.
library(RAWdataR)  # provides the raw.* functions used below
path.RAW = raw.getSamplePath()
raw.inspectFolder(path.RAW)
#> $hasSubfolders
#> [1] FALSE
#> 
#> $projects
#> [1] "Dual,FePc,Optical"
#> 
#> $users
#> [1] "AN,MM,TG"
#> 
#> $samples
#> [1] "20160718,NM20151109CuPc100gl150-A2(UDS).txt,Tb33Al66"
#> 
#> $instruments
#> [1] "mm160622si1,spectrophotometer,vsm"
#> 
#> $validFiles
#> [1] 3
#> 
#> $invalidFiles
#> [1] 4

One of the instruments is off: mm160622si1 is not an instrument name.
Let us find the file it comes from:
basename(raw.findFiles(path.RAW, instrument='MM160622SI1'))
#> [1] "20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt"

The file name is missing the instrument field, so MM160622SI1 was read
as the instrument. It appears to be a vsm data file; the instrument can
be added as follows:
f = basename(raw.findFiles(path.RAW, instrument='MM160622SI1'))
f.new = basename(raw.fixInvalidFile(f, addInstrument='vsm'))
print(paste("Change:",f,"to:",f.new))
#> [1] "Change: 20160718_Dual_MM_MM160622SI1_20160718_2.5K_MVSH_00001.txt to: 20160718_Dual_MM_vsm_MM160622SI1_20160718_2.5K_MVSH_00001.txt"

Find duplicate RAW data files
path.RAW = raw.getSamplePath()
f = raw.findDuplicates(path.RAW,isValid=FALSE)
d1 = data.frame(f=basename(f),md5=raw.getMD5(f),row.names = c())
knitr::kable(d1, col.names = c('Duplicated Files','MD5'))

| Duplicated Files | MD5 |
|---|---|
| 20160607_Optical_AN_Spectrophotometer_NM20151109CuPc100gl150-A2(UDS).txt | 7545cd | 
| not-a-RAW-file.txt | 7545cd | 
This is the list of duplicated files. To find the original of each one, search for all files that share its MD5 checksum:
searchMD5 = d1$md5[!duplicated(d1$md5)]
for(m in searchMD5) {
  print(paste("Duplicates for files with MD5 = ",m))
  f = raw.findFiles(path.RAW, md5=m)
  print(basename(f))
}
#> [1] "Duplicates for files with MD5 =  7545cd"
#> [1] "20141215_FePc_TG_vsm_Tb33Al66_20141215-Tb33Al66-ac1Oe-20Hz.txt"          
#> [2] "20160607_Optical_AN_Spectrophotometer_NM20151109CuPc100gl150-A2(UDS).txt"
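
The same idea of grouping files by MD5 checksum can be sketched in base R, without RAWdataR. This is a minimal sketch, assuming that duplicates are defined purely by identical file content. The helper name find_md5_duplicates and the demo folder are hypothetical and not part of the package, and the sketch uses the full checksums from tools::md5sum rather than the shortened ones shown above.

```r
# Hypothetical helper (not part of RAWdataR): group the files in a folder
# by MD5 checksum and keep only the checksums shared by several files.
find_md5_duplicates <- function(path) {
  files <- list.files(path, full.names = TRUE, recursive = TRUE)
  if (length(files) == 0) return(list())
  md5 <- unname(tools::md5sum(files))     # one checksum per file
  groups <- split(basename(files), md5)   # file names keyed by checksum
  groups[lengths(groups) > 1]             # drop checksums seen only once
}

# Self-contained demonstration in a temporary folder
demo.dir <- file.path(tempdir(), "md5-demo")
dir.create(demo.dir, showWarnings = FALSE)
writeLines("same content", file.path(demo.dir, "a.txt"))
writeLines("same content", file.path(demo.dir, "b.txt"))
writeLines("other content", file.path(demo.dir, "c.txt"))

dups <- find_md5_duplicates(demo.dir)
print(dups)
```

Each element of the returned list holds the names of the files that share one checksum, so both the original and its copies appear together. Unlike the raw.findFiles search above, this sketch does not check whether a file name is valid.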