Load forecasting has been studied extensively because of its critical role in the Smart Grid. Today's Smart Grid serves many types of customers with different energy consumption patterns, which we refer to as customer behaviors. Taking customer behaviors into account could significantly benefit load forecasting for a grid. This paper proposes an innovative method that aggregates customers by their identified behaviors and then predicts the load of each customer cluster, so as to improve the load forecasting accuracy of the whole grid. A Sparse Continuous Conditional Random Fields (sCCRF) model is proposed to identify different customer behaviors through learning. A hierarchical clustering process then aggregates customers according to the identified behaviors. Within each customer cluster, a representative sCCRF is fine-tuned to predict the load of that cluster. The load of the whole grid is obtained by summing the predicted loads of all clusters. The proposed method has two major advantages: 1) learning customer behaviors improves prediction accuracy at a low computational cost; 2) sCCRF can effectively model the load forecasting problem of a single customer while simultaneously selecting the key features that identify its energy consumption pattern. Experiments conducted from different perspectives demonstrate the advantages of the proposed method. Further discussion indicates that the approach of learning customer behaviors can be extended into a general framework to facilitate decision making in other market domains.
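
The cluster-then-forecast-then-sum structure described above can be illustrated with a minimal sketch. It is not the sCCRF method itself. The behavior vectors are assumed to be given rather than learned, average linkage is an assumed choice for the hierarchical clustering, and a least-squares AR(1) predictor stands in for the fine-tuned per-cluster sCCRF. All function and variable names (`agglomerate`, `ar1_forecast`, `forecast_grid`, `loads`, `behaviors`) are hypothetical.

```python
from math import dist


def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage choice).

    Returns a list of clusters, each a list of customer indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = (
            (p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        p, q = min(pairs, key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


def ar1_forecast(series):
    """One-step-ahead forecast from an AR(1) fit without intercept.

    This is only a stand-in for the per-cluster sCCRF predictor.
    """
    num = sum(cur * prev for cur, prev in zip(series[1:], series[:-1]))
    den = sum(prev * prev for prev in series[:-1])
    phi = num / den if den else 1.0
    return phi * series[-1]


def forecast_grid(loads, behaviors, n_clusters):
    """Cluster customers by behavior, forecast each cluster, sum the results."""
    clusters = agglomerate(behaviors, n_clusters)
    horizon = len(loads[0])
    total = 0.0
    for members in clusters:
        # Aggregate the load of all customers in this cluster per time step.
        cluster_series = [sum(loads[i][t] for i in members) for t in range(horizon)]
        total += ar1_forecast(cluster_series)
    return clusters, total


# Toy data: two customers with growing load, two with shrinking load.
behaviors = [(0.0, 1.0), (0.1, 0.9), (1.0, 0.0), (0.9, 0.1)]
loads = [[1, 2, 4, 8], [1, 2, 4, 8], [8, 4, 2, 1], [8, 4, 2, 1]]
clusters, grid_forecast = forecast_grid(loads, behaviors, n_clusters=2)
```

In this toy example the two growing-load customers and the two shrinking-load customers end up in separate clusters, so each cluster is forecast with its own fitted coefficient before the results are summed into the grid-level forecast.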