Experimenting with a new LLM: Qwen
We have established an Alibaba Cloud account to evaluate their model offerings. Initial impressions are consistent with expectations: their models trail the frontier but perform comparably to Llama 3.1 and similar large open-source models — perhaps six to twelve months behind the leading providers. This week I plan to run direct comparisons against our existing Cognosa use cases, which is straightforward given the platform's ability to route identical queries across models and compare outputs side by side.
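The fan-out pattern behind that kind of side-by-side comparison is simple to sketch. The snippet below is a minimal, hypothetical illustration (not Cognosa's actual implementation): it sends one prompt to several model callables concurrently and collects the outputs keyed by model name. The `models` dict of stand-in lambdas is purely illustrative; in practice each entry would wrap a real API client.

```python
from concurrent.futures import ThreadPoolExecutor

def compare_models(prompt, models):
    """Route one prompt to every model callable; return outputs keyed by name."""
    with ThreadPoolExecutor() as pool:
        # Submit all calls up front so the models run concurrently.
        futures = {name: pool.submit(fn, prompt) for name, fn in models.items()}
        return {name: f.result() for name, f in futures.items()}

# Hypothetical stand-ins for real API clients (Claude, Qwen, etc.).
models = {
    "claude": lambda p: f"[claude] {p}",
    "qwen":   lambda p: f"[qwen] {p}",
}

results = compare_models("Explain DNS CNAME records.", models)
for name, output in results.items():
    print(f"{name}: {output}")
```

Swapping the lambdas for actual client calls is the only change needed to benchmark real providers, since each entry is just a function from prompt to response text.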
I have reduced the frequency of these comparisons, however, for two reasons. First, my daily satisfaction with Claude continues to increase. Second, ongoing disappointment with OpenAI — not only in coding tasks but across technical domains including DNS, Docker Compose, and infrastructure troubleshooting — reduces my incentive to benchmark regularly. Gemini has emerged as my preferred backup assistant and likely outperforms ChatGPT in most of my workflows, though I continue to alternate between them. I have also discontinued running large open-source models locally on my Mac Studio: inference is too slow, output quality is insufficient, and my API spend across frontier providers remains well below the point where local hosting would pay for itself — at least for now.