How can we find a natural grouping of a real data set that has an unknown number of clusters and may be contaminated with noise? The basic idea behind most of my recently published approaches is to link clustering with data compression. If the data contains distinct clusters, then, given a suitable model, these patterns allow for effective data compression. The more pronounced the clusters are and the better the model fits the data structure, the more effectively we can compress the data. The first part of this talk highlights the benefits of clustering in neuroscience applications: by identifying interaction patterns among brain regions from fMRI data and detecting the major fiber bundles in DTI data, our methods contribute to a better understanding of the onset and progression of psychiatric and neurodegenerative diseases such as somatoform pain disorder and Alzheimer's disease. A short survey of my further research areas and my plans for future research and teaching concludes the talk.