The first image Frost constructs is visual (sight): the speaker stops "To watch [the owner's] woods fill up with snow" (line 4). We can imagine, based on this sensory description, what the scene looks like: the silent and darkened trees with the snow piling higher and higher around them, as though the forest could "fill up" (like a container) with snow.
The next image is also visual (and perhaps auditory as well): the speaker describes this spot as secluded, "without a farmhouse near / Between the woods and frozen lake / The darkest evening of the year" (6-8). The night is very dark and very still because the speaker is the only person around and there is no ambient light from a farmhouse. We see the woods again, now alongside the "frozen lake," which tells us it must also be very cold (this could be considered tactile imagery).
The next image is auditory (hearing): "The only other sound's the sweep / Of easy wind and downy flake" (11-12). The scene is very quiet, with no human sounds at all: apart from the horse's harness bells, all the speaker can hear is the gentle wind blowing the soft snowflakes around. Because he describes the snowflakes as "downy," we might also consider this a visual image (they are the fat and fluffy kinds of snowflakes) and/or a tactile (touch) one (they are soft and light and airy snowflakes).
Thus, Frost combines mostly visual imagery with some auditory and tactile images to achieve a very tranquil mood for the poem.