Humans can learn to perform multiple tasks in succession over the lifespan (“continual” learning), whereas current machine learning systems fail to do so. Here, we investigated the cognitive mechanisms that permit successful continual learning in humans and harnessed our behavioral findings for neural network design. Humans categorized naturalistic images of trees according to one of two orthogonal task rules that were learned by trial and error. Training regimes that focused on individual rules for prolonged periods (blocked training) improved human performance on a later test involving randomly interleaved rules, compared with control regimes that trained in an interleaved fashion. Analysis of human error patterns suggested that blocked training encouraged humans to form a “factorized” representation that optimally segregated the tasks, especially for those individuals with a strong prior bias to represent the stimulus space in a well-structured way. By contrast, standard supervised deep neural networks trained on the same tasks suffered catastrophic forgetting under blocked training, due to representational interference in the deeper layers. However, augmenting deep networks with an unsupervised generative model that allowed them to first learn a good embedding of the stimulus space (similar to that observed in humans) reduced catastrophic forgetting under blocked training. Building artificial agents that first learn a model of the world may be one promising route to solving continual task performance in artificial intelligence research.