Abstract: Nonlinear stochastic optimization problems arise in a wide range of applications, from acoustic/geophysical inversion to deep learning. The scale, computational cost, and difficulty of these models make classical optimization techniques impractical. To address these challenges, we have developed new optimization methods that are also well suited to distributed computing implementations. Our techniques employ adaptive sampling strategies that gradually increase the accuracy of the step computation in order to achieve efficiency and scalability, and they incorporate second-order information by exploiting the stochastic nature of the problem. We provide some interesting (and perhaps surprising) complexity results for our methods. We illustrate the performance of our methods on large-scale machine learning models, both convex and non-convex. We conclude by highlighting some open questions that arise when training deep neural networks.
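
To make the idea of adaptive sampling concrete, the following is a minimal sketch, not the authors' method: the abstract does not specify how the sample size is chosen, so a simple geometric growth rule is swapped in here purely for illustration. The problem (a one-dimensional least-squares fit), the function names, and all parameter values (`batch0`, `growth`, `step`) are hypothetical. The sketch shows only the general pattern: early steps use small, cheap gradient samples, and later steps use larger samples so the step is computed more accurately.

```python
import random


def make_data(n, seed=0):
    """Synthetic 1-D least-squares data b = 3*a + noise (hypothetical test problem)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        a = rng.uniform(-1.0, 1.0)
        b = 3.0 * a + rng.gauss(0.0, 0.1)
        data.append((a, b))
    return data


def sampled_gradient(w, batch):
    """Average gradient of 0.5 * (a*w - b)**2 over the sampled batch."""
    return sum((a * w - b) * a for a, b in batch) / len(batch)


def adaptive_sampling_gd(data, w0=0.0, step=1.0, batch0=2,
                         growth=1.5, iters=25, seed=1):
    """Gradient descent whose sample size grows geometrically.

    Assumption: geometric growth stands in for whatever sampling
    test the authors actually use; it is not taken from the abstract.
    """
    rng = random.Random(seed)
    w = w0
    batch_size = batch0
    sizes = []
    for _ in range(iters):
        # Cap the sample size at the full data set.
        m = min(int(batch_size), len(data))
        batch = rng.sample(data, m)
        w -= step * sampled_gradient(w, batch)
        sizes.append(m)
        # Increase the sample size so later steps are more accurate.
        batch_size *= growth
    return w, sizes


if __name__ == "__main__":
    data = make_data(2000)
    w, sizes = adaptive_sampling_gd(data)
    print("estimated slope:", w)
    print("sample sizes used:", sizes)
```

In this sketch the first iterations touch only a handful of data points, while the final iterations use the full data set. The second-order information and the complexity results mentioned in the abstract are not represented here.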