We give a sound and complete procedure for fence insertion for concurrent finite-state programs running under the PSO memory model. This model allows “write to read” and “write-to-write” relaxations corresponding to the addition of an unbounded store buffers between processors and the main memory. We introduce a novel machine model, called the Hierarchical Single-Buffer (HSB) semantics, and show that the reachability problem for a program under PSO can be reduced to the reachability problem under HSB. We present a simple and effective backward reachability analysis algorithm for the latter, and propose a counter-example guided fence insertion procedure. The procedure infers automatically a minimal set of fences that ensures correctness of the program. We have implemented a prototype and run it successfully on all standard benchmarks, together with several challenging examples.