The development of humanoid robots is one of the most challenging research fields within robotics. One of the crucial capabilities of such a humanoid is the ability to visually perceive its environment. The present monograph deals with visual perception for the intended applications of manipulation and imitation, in support of higher-level cognition. In particular, stereo-based methods and systems for object recognition and 6-DoF pose estimation as well as for markerless human motion capture are presented. After an extensive presentation of the state of the art in these areas, three real-time systems developed by the author are presented in detail: object recognition and pose estimation for textured objects, object recognition and pose estimation for single-colored objects, and markerless human motion capture. A stereo camera system is used as the only sensor. All experiments have been performed on the humanoid robot ARMAR-III. The systems presented in this monograph have been successfully applied in various research activities in the context of humanoid robotics at the University of Karlsruhe, including manipulation, imitation, visual servoing, motion planning, and higher-level planning.