Humans have an impressive, automatic capacity for identifying and organizing sounds in their environment. However, little is known about the timescales over which sound identification operates, or the acoustic features that listeners use to identify auditory objects. To better understand the temporal and acoustic dynamics of sound category identification, two go/no-go perceptual gating studies were conducted. Participants heard speech, musical instrument, and human-environmental sounds ranging from 12.5 to 200 ms in duration. Listeners could reliably identify sound categories from stimuli as short as 25 ms. In Experiment 1, participants’ performance on instrument sounds showed a distinct processing advantage at shorter durations. Experiment 2 revealed that this advantage was largely dependent on regularities in instrument onset characteristics relative to the spectrotemporal complexity of environmental sounds and speech. Models of participant responses indicated that listeners used spectral, temporal, noise, and pitch cues in the task. Aspects of spectral centroid were associated with responses for all categories, while noisiness and spectral flatness were associated with environmental and instrument responses, respectively. Responses for speech and environmental sounds were also associated with spectral features that varied over time. Experiment 2 indicated that variability in fundamental frequency was useful in identifying steady-state speech and instrument stimuli.