Revision history for ArrayPipe (use the Refresh button to see the latest version)

revision 1.7:
date: 2005/04/30 07:54:43

- changes:
  * changed swissprot file to uniprot file
  * added variables in preparation for DB backend
  * changed default link for program to 'localhost' (was www.pathogenomics.ca)
  * updated weblink for BIND data
  * changed name of RATIO_DIFF_TERM to 'r.o.r.' (from RATIO_DIFF)
  * specified minimum width of 450 pixels for chip visualizations
  * added function to calculate variation between replicates (within slide, between pairs, technical and biological replicates)
  * added function to merge pairs
  * activated function for calculating values after skipping of outliers (SOV)
  * made sure that functions request at least spot id term to be read in from data file
  * MA plot pictures can be made available as PDFs
  * a file can now be uploaded for the array annotation
  * flag markers automatically flags spots that occur more than a specified number of times (defaults to 10)
  * changed orientation option for chip visualization from 'portrait' and 'landscape' to 'original' and 'rotated 90 ccw'
  * replaced technical replicate with the idea of pairs, which allows dye-swaps to be matched up, for example
  * separate routine for adding uniprot information
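
The automatic flagging of frequently occurring spots listed above can be illustrated with a minimal Python sketch. This is not ArrayPipe's own code: the function name `flag_frequent_spots` and the sample spot names are hypothetical, and only the default threshold of 10 is taken from the entry above.

```python
from collections import Counter

def flag_frequent_spots(spot_names, max_occurrences=10):
    """Return the set of spot names that occur more than max_occurrences times.

    Hypothetical helper for illustration; the default of 10 mirrors the
    changelog entry, everything else is an assumption.
    """
    counts = Counter(spot_names)
    return {name for name, n in counts.items() if n > max_occurrences}

# A name that appears 12 times (e.g. a control spot) is flagged,
# while names that appear only once or twice are left alone.
spots = ["blank"] * 12 + ["geneA", "geneB", "geneA"]
flagged = flag_frequent_spots(spots)
```

Using "more than" rather than "at least" means that a name occurring exactly 10 times is not flagged under the default setting.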

- bug fixes:
  * flagged spots were scaled differently in Z-score output


revision 1.6.1:
date: Fri Mar  4 14:26:17 PST 2005

- bug fixes:
  * program hung when run on localhost
  * stop of action due to high system load was not reported on output page
  * if no directory for saved sessions was specified, files in /tmp were listed
  
revision 1.6:
date: Tue Feb 22 10:43:20 PST 2005

- changes:
  * it is now possible to specify an alias for the input file(s); this makes it possible to influence the order in which output is presented and also provides more meaningful descriptions in case of two-part data files, as in ImaGene output
  * added forcing of TMEV generation for data that is only available as ratio or fold-change (the intensities are artificially created)
  * allowing regular expressions for flags, e.g. /^(-)/ for spot names starting with '(-)' (negative controls)
  * added more recognizers for spot name column and intensity columns (SpotName, gMeanSignal, rMeanSignal, gBGUsed, rBGUsed)
  * added one more field for list overlay
  * changed ending of archive with TMEV files from .tgz to .tar.gz for better indication of the nature of the file
  * replaced 'Probe ID' with variable $PROBE_ID_TERM
  * read probe ids from data column $PROBE_ID_TERM if available
  * added splitter for barplot/histogram (linearize multiple values in a row)
  * expanded histogram borders by 20% of range in each direction
  * increased number of filter fields to 8
  * added index update when using the spreadsheet
  * added skipping of lists
  * set default method for merging slides to median (the weighted mean will confuse as long as the t-tests are not adjusted to take weights into account!)
  * moved location of external programs into configuration file
  * added more explanations in ANOVA output
  * stricter regulation of channel annotation (so far it didn't matter if CH1 and CH2 were swapped around)
  * added annotation support for OCI H19kv6 files
  * made link to uniprot annotation file configurable
  * made path to annotation files configurable
  * changed 'public_username' to 'default_username'
  * changed parameter configuration mechanism to a more flexible hash structure
  * added system error messages to STDERR output
  * added mechanism to display current news and information
  * cleaned up error and information messages, now displayed when 'verbose' and/or 'debug' is set
  * avoid second call in background when BATCH processing from command line
  * enabled reading of data files in self-defined ArrayPipe format
  * improved recognition of GenePix files
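
As a rough illustration of flagging by regular expression (see the entry on flags above), the following Python sketch marks spot names that start with '(-)'. The helper `flag_by_pattern`, the flag label and the sample names are hypothetical. Note that in Python's `re` syntax the parentheses have to be escaped to match them literally; how ArrayPipe itself interprets the pattern is not shown here.

```python
import re

def flag_by_pattern(spot_names, pattern, flag="control"):
    """Map every spot name matching the pattern to the given flag.

    Hypothetical helper; names that do not match are left out of the result.
    """
    regex = re.compile(pattern)
    return {name: flag for name in spot_names if regex.search(name)}

# Only names that begin with the literal text '(-)' are flagged;
# '(-)' in the middle of a name does not count because of the '^' anchor.
names = ["(-) buffer", "geneA", "(-) empty", "gene(-)B"]
flags = flag_by_pattern(names, r"^\(-\)")
```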

- bug fixes:
  * avoid multiple occurrences of the same column in output
  * avoid empty ARRAY... column in output
  * spot id didn't come up if no annotation was available
  * older versions didn't incorporate file-specs
  * spreadsheet output files were overwritten
  * sometimes log-taking of 0 was attempted
  * give full permissions to new user directory
  * changed invalid values for TMEV from '1' to '0'
  * skip values of '0' when duplicate spots are merged
  * flag spots as 'undefined' if one channel has zero foreground intensity
  * too many dollar signs in $$cluster_nodes
  * use new file-handle for reading of annotation file (some other file-handle must be open and this led to the first line being skipped)
  * load in new TMEV format
  * dealing with duplicate headers failed for Windows files
  * empty trailing fields in web-based spreadsheet didn't show up properly
  * unloading of lists didn't work
  * flags from lists hid flags from data file
  * report error properly if second file for an ImaGene-type data set wasn't found
  * avoided browser caching problem that led to loading of the same spreadsheet again and again
  * merging of duplicates skipped spots with negative log-ratio
  * name of merged output file wasn't reported
  * wrong values were returned for t-test where mean == pop_mean and stderr == 0
  * endless loop with certain types of GenePix input
  * number of rows and columns were changed for one array type (12,4,17,16 layout)
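
The t-test fix listed above concerns the case where the sample mean equals the population mean and the standard error is zero, so that the usual statistic becomes 0/0. A minimal Python sketch of one possible way to guard that case follows. The function `one_sample_t` and the convention of returning 0 are assumptions made for illustration; the changelog does not state which values ArrayPipe returns.

```python
import math
from statistics import mean, stdev

def one_sample_t(values, pop_mean=0.0):
    """One-sample t statistic with explicit handling of a zero standard error.

    Assumed convention (not taken from ArrayPipe): identical values that
    equal pop_mean give t = 0; identical values that differ from pop_mean
    give a signed infinity.
    """
    n = len(values)
    diff = mean(values) - pop_mean
    std_err = stdev(values) / math.sqrt(n)
    if std_err == 0:
        if diff == 0:
            return 0.0  # 0/0 case: no evidence of any difference
        return math.copysign(math.inf, diff)
    return diff / std_err

t_degenerate = one_sample_t([1.0, 1.0, 1.0], pop_mean=1.0)
t_regular = one_sample_t([1.0, 3.0], pop_mean=0.0)
```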


revision 1.4:
date: 2004/10/13 21:17:00
- changes:
  * changed RI plot to MA plot - this seems to be the more commonly used plot
  * set default for Welch t-test (within group) to work on intensities (if 
    this was set to ratios it would act exactly like a Student t-test)
  * enable reading of data and annotation from previously saved ArrayPipe files (allows 
    saving of normalized data and channel annotation)
  * added lines indicating 2-fold change to MA and scatter plots
  * adjust size of text boxes to channel annotation
  * made long listing of spot info for print-tip box plot optional 
    (switched off by default)
  * set default for print-tip box plots to 'ratios' (not 'auto' which 
    sometimes prints both channels beside each other)
  * changed 'Set cutoffs' to 'Set cutoffs (data shift)' to make the name 
    a bit more explanatory
  * added loess functions (printtip and global) from Bioconductor's limma 
    package, which is faster and more robust (but yields the same results 
    as the loess from the marrayNorm package)
  * added function that allows changing one or more flags to another
  * added loading capability of TMEV files (e.g. from TIGR's Spotfinder)
  * avoid excessive skipping of spots if one or more values are missing 
    but there is still sufficient data for t-tests available
  * added weighted averaging (and made it default) for merging of 
    replicates
  * added weights used in averaging as an output column
  * list merging of technical replicates before merging of all replicates 
    in the module list (more logical order this way)
  * report the number of valid entries that go into p-value calculation 
    (within groups only)

- bug fixes:
  * overlay list with only one column didn't work for windows files
  * t-test between groups allowed too many NA's
  * round numbers in scientific notation as well (spreadsheet)
  * rename headers with multiple occurrences in spreadsheet output
  * set flagging information for 'a' (absent) to 'automatic' and handle 
    multiple instances properly
  * set value of flagged/undefined entries in MEV output file to '0', 
    which will make spot show up as a grey box if present in both channels
  * set list size for some file selection boxes
  * prevented over-flagging in normalization
  * fixed problem with colouring flagged spots when multiple MA-plots follow 
    each other
  * the function 'Signal box plot (printTip)' calculated log-transformed ratios the wrong way: instead of subtracting the log-transformed intensities, these were accidentally divided. This resulted in wrong box plots and in some cases also in wrong ratios in the results output.
  * empty fields at end of input row were skipped and caused warning 
    (insert empty elements instead)
  * name of merged output file wasn't reported
  * files with special characters such as brackets caused problems in 
    spreadsheet functionality
  * y-coordinates of quartiles in MA plots were sometimes wrong
  * rounding error sometimes caused problems with calculation of standard 
    deviation
  * fixed problem with merging of files (attempt of taking log of 
    negative values that are log-transformed already)
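
The 'Signal box plot (printTip)' fix listed above comes down to the identity log(a/b) = log(a) - log(b): a log-ratio is the difference of the log intensities, not their quotient. The Python sketch below illustrates this together with the usual M and A values of an MA plot. The function names, the choice of base 2 and the channel order are assumptions, not ArrayPipe's code.

```python
import math

def log_ratio(ch1, ch2):
    """log2(ch1 / ch2), computed as a difference of log intensities."""
    return math.log2(ch1) - math.log2(ch2)

def wrong_log_ratio(ch1, ch2):
    """The kind of mistake described above: dividing the logs instead."""
    return math.log2(ch1) / math.log2(ch2)

def ma_values(ch1, ch2):
    """M (log-ratio) and A (mean log intensity) as used in an MA plot."""
    m = math.log2(ch1) - math.log2(ch2)
    a = 0.5 * (math.log2(ch1) + math.log2(ch2))
    return m, a

# A four-fold difference between the channels is a log2 ratio of 2;
# dividing the logs gives a different, meaningless number.
correct = log_ratio(800.0, 200.0)
incorrect = wrong_log_ratio(800.0, 200.0)
m, a = ma_values(800.0, 200.0)
```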

revision 1.2
date: Tue Jul 27 16:15:05 PDT 2004
- changes:
 * added Student, Wilcox and Welch for tests between groups
 * distinction between files with channel label 'C|T1' and 'C|T2'
   (beforehand the control channels needed to be different)
 * default channel labelling is now 'C' and 'T' (instead of 'T1')
 * new 'Extra tool' in spreadsheet: add column calculated from existing ones
   (e.g. to calculate fold-change between two conditions)
 * removed 'Save List' feature from spreadsheet (it was a bit confusing
   and not that useful)
 * included counter 'i' if probe coordinates were selected
 
- bug fixes:
 * links to MEV archive corrected
 * inclusion of p-anova in merged files
 * corrected missing entries in spreadsheet if the first of a set of files
   doesn't contain values for a spot
 * fixed bug that messed up spreadsheet output when TIGR format has been
   selected for output as well

revision 1.0
date: 2004/07/07 19:26:20;  author: khokamp;  state: Exp;  lines: +6370 -1960
- added columns with gene names and gene descriptions to the front of the table
- add ratio column
- added spot and probe ids to hidden fields in spreadsheet output
- added significance values to simple spreadsheet output
- added inverse pattern match for first filter field (because 'BUTNOT' connector is not available for that one)
- added list upload for extra filtering in spreadsheet
- upgraded flagging of flawed duplicates: allows flagging of spots with more than x-fold standard deviation from median absolute or fold difference
- fixed sorting bug in spreadsheet (if values in scientific notation were encountered the sorting mode changed erroneously to alphabetical instead of numerical sorting)
- change header of file if multiple columns with the same header are found
- keep name of output files slightly simpler
- avoid ugly empty cells in the last column of a spreadsheet table
- added a box for the hidden columns in a spreadsheet
- fixed bug that didn't report file problems if all files were faulty
- add name of file to spreadsheet title if only one file is shown
- avoid long endings to file names in spreadsheet
- changed default of number of sample output lines to 0, in which case a link to a file with 40 sample lines is provided
- added defaults for p-values to the p-<name> values
- changed default of set cutoffs from global to individual shift
- fixed size of new list in spreadsheet output
- remove temporary files if $clean_up_files is set
- exclude parameters from output that have not been used
- skip writing complete output to huge file
- remove skip_empty option in output module, because it is buggy and probably not being used
- reduced time of merging process for spreadsheet
- had to put step from ArrayPipe to Spreadsheet to the background because it could take longer than the 5 minute time-out threshold if many large files are to be dealt with
- added 'ignore sign' box to filter to work on absolute numbers
- changed expire from +1y to -1s for each header to avoid caching of pages
- fixed bug with storing modified spot id (numbers in brackets need to be cut off so that flags are recognized)
- fixed bug with annotation of JB 22x22 slides (spot id was processed too early and probe id wasn't accessible anymore)
- fixed bug with extra columns in TIGR annotation file
- deletion of uploaded archives after extracting of files
- gave merged ratios priority over individual ratios in filter-by-value
- adjustments for new ProbeLynx headers
- added merging of background values when merging duplicated spots
- started work on hiding settings
- started work on moving output into separate pages
----------------------------
revision 0.96
date: 2004/05/26 19:29:31;  author: khokamp;  state: Exp;  lines: +150 -16
- fixed merging of replicates: previously the intensity values were merged and ratios were taken afterwards; now the ratios of a merged file are calculated as the median of the ratios from the individual files. This should give more accurate values if files have not been normalized globally.
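
The difference between the two merging strategies described above can be seen in a small Python sketch. The function names and the sample intensities are made up for illustration, and the sketch assumes that the earlier approach also merged intensities by taking the median, which the entry does not state.

```python
from statistics import median

def ratio_of_merged_intensities(replicates):
    """Earlier approach (as assumed here): merge each channel, then divide."""
    merged_ch1 = median(ch1 for ch1, _ in replicates)
    merged_ch2 = median(ch2 for _, ch2 in replicates)
    return merged_ch1 / merged_ch2

def median_of_ratios(replicates):
    """Approach described above: take each file's ratio, then the median."""
    return median(ch1 / ch2 for ch1, ch2 in replicates)

# One spot measured in three files with very different overall brightness,
# as can happen when the files have not been normalized globally.
replicates = [(200.0, 100.0), (4000.0, 2000.0), (900.0, 300.0)]
old_value = ratio_of_merged_intensities(replicates)
new_value = median_of_ratios(replicates)
```

In this made-up example two of the three files agree on a ratio of 2, and the median of the ratios reflects that, whereas merging the intensities first lets the file with mid-range brightness determine the result.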
----------------------------
revision 0.95
date: 2004/05/26 17:37:25;  author: khokamp;  state: Exp;  lines: +65 -46
- In the previous versions the values in the fold-change column were sometimes derived from the raw foreground intensities, instead of normalized and merged values. This has been fixed now.
----------------------------
revision 0.94
date: 2004/05/21 01:17:54;  author: khokamp;  state: Exp;  lines: +23 -12
- added markers for JB 21K slide
- added gzip'ed tar archive for MEV files
----------------------------
revision 0.93
date: 2004/05/20 21:00:37;  author: khokamp;  state: Exp;  lines: +238 -53
- fold-change column has been added to the output
- annotation for JackBell 21K slides added
----------------------------
revision 0.92
date: 2004/05/20 13:32:54;  author: khokamp;  state: Exp;  lines: +832 -82
- ANOVA added
- bug fix in permutation program
- small bug fixes
----------------------------
revision 0.91
date: 2004/05/20 13:32:02;  author: khokamp;  state: Exp;
- first stored version of public release