


     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



     NNNNAAAAMMMMEEEE
          sox - SOund eXchange - universal sound sample translator

     SSSSYYYYNNNNOOOOPPPPSSSSIIIISSSS
          ssssooooxxxx _i_n_f_i_l_e _o_u_t_f_i_l_e
          ssssooooxxxx _i_n_f_i_l_e _o_u_t_f_i_l_e [[[[ _e_f_f_e_c_t [[[[ _e_f_f_e_c_t _o_p_t_i_o_n_s ... ]]]] ]]]]
          ssssooooxxxx _i_n_f_i_l_e ----eeee _e_f_f_e_c_t [[[[ _e_f_f_e_c_t _o_p_t_i_o_n_s ... ]]]]
          ssssooooxxxx [ _g_e_n_e_r_a_l _o_p_t_i_o_n_s  ]]]] [[[[ _f_o_r_m_a_t _o_p_t_i_o_n_s  ]]]] _i_f_i_l_e [[[[ _f_o_r_m_a_t
          _o_p_t_i_o_n_s  ]]]] _o_f_i_l_e [[[[ _e_f_f_e_c_t [ _e_f_f_e_c_t _o_p_t_i_o_n_s ... ]]]] ]]]]
          _G_e_n_e_r_a_l _o_p_t_i_o_n_s:  [[[[ ----VVVV ]]]] [[[[ ----vvvv _v_o_l_u_m_e ]]]]
          _F_o_r_m_a_t _o_p_t_i_o_n_s:  [[[[ ----tttt _f_i_l_e_t_y_p_e ]]]] [[[[ ----rrrr _r_a_t_e ]]]] [[[[ ----ssss////----uuuu//
//----UUUU////----AAAA ]]]]
          [[[[ ----bbbb////----wwww////----llll////----ffff////----dddd////----DDDD ]]]] [[[[ ----cccc _c_h_a_n_n_e_l_s ]]]
] [[[[ ----xxxx ]]]]
          _E_f_f_e_c_t_s:
               ccccooooppppyyyy
               rrrraaaatttteeee
               aaaavvvvgggg
               ssssttttaaaatttt
               eeeecccchhhhoooo _d_e_l_a_y _v_o_l_u_m_e [[[[ _d_e_l_a_y _v_o_l_u_m_e ... ]]]]
               vvvviiiibbbbrrrroooo _s_p_e_e_d [[[[ _d_e_p_t_h ]]]]
               lllloooowwwwpppp _c_e_n_t_e_r
               bbbbaaaannnndddd [[[[ -_n ]]]] _c_e_n_t_e_r [[[[ _w_i_d_t_h ]]]]

     DDDDEEEESSSSCCCCRRRRIIIIPPPPTTTTIIIIOOOONNNN
          _S_o_x translates sound files from one format to another,
          possibly doing a sound effect.

     OOOOPPPPTTTTIIIIOOOONNNNSSSS
          The option syntax is a little grotty, but in essence:
               sox file.au file.voc
          translates a sound sample in SUN Sparc .AU format into a
          SoundBlaster .VOC file, while
               sox -v 0.5 file.au -rate 12000 file.voc rate
          does the same format translation but also lowers the
          amplitude by 1/2 and changes the sampling rate from 8000
          hertz to 12000 hertz via the rrrraaaatttteeee _s_o_u_n_d _e_f_f_e_c_t loop.

          File type options:

          ----tttt _f_i_l_e_t_y_p_e
                    gives the type of the sound sample file.

          ----rrrr _r_a_t_e   Give sample rate in Hertz of file.

          ----ssss////----uuuu////----UUUU////----AAAA
                    The sample data is signed linear (2's complement),
                    unsigned linear, U-law (logarithmic), or A-law
                    (logarithmic).  U-law and A-law are the U.S. and
                    international standards for logarithmic telephone
                    sound compression.

          ----bbbb////----wwww////----llll////----ffff////----dddd////----DDDD
                    The sample data is in bytes, 16-bit words, 32-bit



     Page 1                                          (printed 1/28/93)






     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



                    longwords, 32-bit floats, 64-bit double floats, or
                    80-bit IEEE floats.  Floats and double floats are
                    in native machine format.

          ----xxxx        The sample data is in XINU format; that is, it
                    comes from a machine with the opposite word order
                    than yours and must be swapped according to the
                    word-size given above.  Only 16-bit and 32-bit
                    integer data may be swapped.  Machine-format
                    floating-point data is not portable.  IEEE floats
                    are a fixed, portable format. ???

          ----cccc _c_h_a_n_n_e_l_s
                    The number of sound channels in the data file.
                    This may be 1, 2, or 4; for mono, stereo, or quad
                    sound data.

          General options:

          ----eeee        after the input file allows you to avoid giving an
                    output file and just name an effect.  This is only
                    useful with the ssssttttaaaatttt effect.

          ----vvvv _v_o_l_u_m_e Change amplitude (floating point); less than 1.0
                    decreases, greater than 1.0 increases.  Note: we
                    perceive volume logarithmically, not linearly.
                    Note: see the ssssttttaaaatttt effect.

          ----VVVV        Print a description of processing phases.  Useful
                    for figuring out exactly how _s_o_x is mangling your
                    sound samples.

          The input and output files may be standard input and output.
          This is specified by '-'.  The ----tttt _t_y_p_e option must be given
          in this case, else _s_o_x will not know the format of the given
          file.  The ----tttt,,,, ----rrrr,,,, ----ssss////----uuuu////----UUUU////----AAAA,,,, ----bbbb////----wwww////----l
lll////----ffff////----dddd////----DDDD and ----xxxx
          options refer to the input data when given before the input
          file name.  After, they refer to the output data.

          If you don't give an output file name, _s_o_x will just read
          the input file.  This is useful for validating structured
          file formats; the ssssttttaaaatttt effect may also be used via the ----eeee
          option.

     FFFFIIIILLLLEEEE TTTTYYYYPPPPEEEESSSS
          _S_o_x needs to know the formats of the input and output files.
          File formats which have headers are checked, if that header
          doesn't seem right, the program exits with an appropriate
          message.  Currently, the raw (no header), IRCAM Sound Files,
          Sound Blaster, SPARC .AU (w/header), Mac HCOM, PC/DOS .SOU,
          Sndtool, and Sounder, NeXT .SND, Windows 3.1 RIFF/WAV, and
          Amiga/SGI AIFF and 8SVX formats are supported.



     Page 2                                          (printed 1/28/93)






     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



          ....aaaaiiiiffffffff     AIFF files used on Amiga and SGI.  Note: the AIFF
                    format supports only one SSND chunk.  It does not
                    support multiple sound chunks, or the 8SVX musical
                    instrument description format.  AIFF files are
                    multimedia archives and and can have multiple
                    audio and picture chunks.  You may need a separate
                    archiver to work with them.

          ....aaaauuuu       SUN Microsystems AU files.  There are apparently
                    many types of .au files; DEC has invented its own
                    with a different magic number and word order. The
                    .au handler can read these files but will not
                    write them.  Some .au files have valid AU headers
                    and some do not.  The latter are probably original
                    SUN u-law 8000 hz samples.  These can be dealt
                    with using the ....uuuullll format (see below).

          ....hhhhccccoooommmm     Macintosh HCOM files.  These are (apparently) Mac
                    FSSD files with some variant of Huffman
                    compression.  The Macintosh has wacky file formats
                    and this format handler apparently doesn't handle
                    all the ones it should.  Mac users will need your
                    usual arsenal of file converters to deal with an
                    HCOM file under Unix or DOS.

          ....rrrraaaawwww      Raw files (no header).
                    The sample rate, size (byte, word, etc), and style
                    (signed, unsigned, etc.) of the sample file must
                    be given.  The number of channels defaults to 1.

          ....uuuubbbb,,,, ....ssssbbbb,,,, ....uuuuwwww,,,, ....sssswwww,,,, ....uuuullll
                    These are several suffices which serve as a
                    shorthand for raw files with a given size and
                    style.  Thus, uuuubbbb,,,, ssssbbbb,,,, uuuuwwww,,,, sssswwww,,,, and uuuullll correspond to
                    "unsigned byte", "signed byte", "unsigned word",
                    "signed word", and "ulaw" (byte).  The sample rate
                    defaults to 8000 hz if not explicitly set, and the
                    number of channels (as always) defaults to 1.
                    There are lots of Sparc samples floating around in
                    u-law format with no header and fixed at a sample
                    rate of 8000 hz.  (Certain sound management
                    software cheerfully ignores the headers.)
                    Similarly, most Mac sound files are in unsigned
                    byte format with a sample rate of 11025 or 22050
                    hz.

          ....ssssffff       IRCAM Sound Files.
                    SoundFiles are used by academic music software
                    such as the CSound package, and the MixView sound
                    sample editor.

          ....vvvvoooocccc      Sound Blaster VOC files.



     Page 3                                          (printed 1/28/93)






     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



                    VOC files are multi-part and contain silence
                    parts, looping, and different sample rates for
                    different chunks.  On input, the silence parts are
                    filled out, loops are rejected, and sample data
                    with a new sample rate is rejected.  Silence with
                    a different sample rate is generated
                    appropriately.  On output, silence is not
                    detected, nor are impossible sample rates.

          ....wwwwaaaavvvv      Windows 3.1 .WAV RIFF files.
                    These appear to be very similar to IFF files, but
                    not the same. They are the native sound file
                    format of Windows 3.1.  Obviously, Windows 3.1 is
                    of such incredible importance to the computer
                    industry that it just had to have its own sound
                    file format.

     EEEEFFFFFFFFEEEECCCCTTTTSSSS
          Only one effect from the palette may be applied to a sound
          sample.  To do multiple effects you'll need to run _s_o_x in a
          pipeline.

          copy                          Copy the input file to the
                                        output file.  This is the
                                        default effect if both files
                                        have the same sampling rate,
                                        or the rates are "close".

          rate                          Translate input sampling rate
                                        to output sampling rate via
                                        linear interpolation to the
                                        Least Common Multiple of the
                                        two sampling rates.  This is
                                        the default effect if the two
                                        files have different sampling
                                        rates.  This is fast but
                                        noisy.

          avg                           Mix 4- or 2-channel sound file
                                        into 2- or 1-channel file by
                                        averaging the samples for
                                        different speakers.

          stat                          Do a statistical check on the
                                        input file, and print results
                                        on the standard error file.
                                        ssssttttaaaatttt may copy the file
                                        untouched from input to
                                        output, if you select an
                                        output file. The "Volume
                                        Adjustment:" field in the
                                        statistics gives you the



     Page 4                                          (printed 1/28/93)






     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



                                        argument to the ----vvvv _n_u_m_b_e_r
                                        which will make the sample as
                                        loud as possible.

          echo [ _d_e_l_a_y _v_o_l_u_m_e ...  ]]]]    Add echoing to a sound sample.
                                        Each delay/volume pair gives
                                        the delay in seconds and the
                                        volume (relative to 1.0) of
                                        that echo.  If the volumes add
                                        up to more than 1.0, the sound
                                        will melt down instead of
                                        fading away.

          vibro _s_p_e_e_d  [[[[ _d_e_p_t_h ]]]]        Add the world-famous Fender
                                        Vibro-Champ sound effect to a
                                        sound sample by using a sine
                                        wave as the volume knob.
                                        SSSSppppeeeeeeeedddd gives the Hertz value of
                                        the wave.  This must be under
                                        30.  DDDDeeeepppptttthhhh gives the amount
                                        the volume is cut into by the
                                        sine wave, ranging 0.0 to 1.0
                                        and defaulting to 0.5.

          lowp _c_e_n_t_e_r                   Apply a low-pass filter.  The
                                        frequency response drops
                                        logarithmically with _c_e_n_t_e_r
                                        frequency in the middle of the
                                        drop.  The slope of the filter
                                        is quite gentle.

          band [[[[ -_n ]]]] _c_e_n_t_e_r [[[[ _w_i_d_t_h ]]]]  Apply a band-pass filter.  The
                                        frequency response drops
                                        logarithmically around the
                                        _c_e_n_t_e_r frequency.  The _w_i_d_t_h
                                        gives the slope of the drop.
                                        The frequencies at _c_e_n_t_e_r +
                                        _w_i_d_t_h and _c_e_n_t_e_r - _w_i_d_t_h will
                                        be half of their original
                                        amplitudes.  BBBBaaaannnndddd defaults to
                                        a mode oriented to pitched
                                        signals, i.e. voice, singing,
                                        or instrumental music.  The -_n
                                        (for noise) option uses the
                                        alternate mode for un-pitched
                                        signals.  BBBBaaaannnndddd introduces
                                        noise in the shape of the
                                        filter, i.e. peaking at the
                                        _c_e_n_t_e_r frequency and settling
                                        around it.

          _S_o_x enforces certain effects.  If the two files have



     Page 5                                          (printed 1/28/93)






     SSSSOOOOXXXX((((1111))))                        MMMMUUUUNNNNIIIIXXXX                        SSSSOOOOXXXX((((1111))))



          different sampling rates, the requested effect must be one
          of ccccooooppppyyyy,,,, or rrrraaaatttteeee,,,, If the two files have different numbers of
          channels, the aaaavvvvgggg effect must be requested.

     BBBBUUUUGGGGSSSS
          The syntax is horrific.  It's very tempting to include a
          default system that allows an effect name as the program
          name and just pipes a sound sample from standard input to
          standard output, but the problem of inputting the sample
          rates makes this unworkable.

     FFFFIIIILLLLEEEESSSS
     SSSSEEEEEEEE AAAALLLLSSSSOOOO
     NNNNOOOOTTTTIIIICCCCEEEESSSS
          The echoplex effect is:
              Copyright (C) 1989 by Jef Poskanzer.
              Permission to use, copy, modify, and distribute this
          software and its
              documentation for any purpose and without fee is hereby
          granted, provided
              that the above copyright notice appear in all copies and
          that both that
              copyright notice and this permission notice appear in
          supporting
              documentation.  This software is provided "as is"
          without express or
              implied warranty.




























     Page 6                                          (printed 1/28/93)



