zhiminy committed on
Commit 0f07314 · 1 Parent(s): da05d30

use 20b as guardrail

Files changed (2)
  1. README.md +1 -0
  2. app.py +2 -2
README.md CHANGED
@@ -25,6 +25,7 @@ Welcome to **SWE-Model-Arena**, an open-source platform designed for evaluating
 - Community detection: Newman modularity score
 - Consistency score: Quantify model determinism and reliability through self-play matches
 - **Transparent, Open-Source Leaderboard**: View real-time model rankings across diverse SE workflows with full transparency.
+- **Intelligent Request Filtering**: Employs `GPT-OSS-20B` as a guardrail to automatically filter out non-software-engineering-related requests, ensuring focused and relevant evaluations.
 
 ## Why SWE-Model-Arena?
 
app.py CHANGED
@@ -866,7 +866,7 @@ with gr.Blocks(js=clickable_links_js) as app:
 
     def guardrail_check_se_relevance(user_input):
         """
-        Use gpt-5-nano to check if the user input is SE-related.
+        Use gpt-oss-20b to check if the user input is SE-related.
         Return True if it is SE-related, otherwise False.
         """
         # Example instructions for classification — adjust to your needs
@@ -883,7 +883,7 @@ with gr.Blocks(js=clickable_links_js) as app:
         try:
             # Make the chat completion call
             response = openai_client.chat.completions.create(
-                model="gpt-5-nano", messages=[system_message, user_message]
+                model="gpt-oss-20b", messages=[system_message, user_message]
            )
             classification = response.choices[0].message.content.strip().lower()
             # Check if the LLM responded with 'Yes'
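For context, the following is a minimal, self-contained sketch of how the patched guardrail fits together. The `openai_client` construction, the classification prompt wording, the yes/no parsing, and the error handling are assumptions for illustration; only the function signature, docstring, completion call, and the changed `model="gpt-oss-20b"` argument appear in the diff above.

```python
from openai import OpenAI

# Hypothetical client setup: app.py builds `openai_client` elsewhere, and the
# base URL / API key of the endpoint serving gpt-oss-20b are assumptions here.
openai_client = OpenAI()


def guardrail_check_se_relevance(user_input):
    """
    Use gpt-oss-20b to check if the user input is SE-related.
    Return True if it is SE-related, otherwise False.
    """
    # Illustrative prompt; the exact wording used in app.py is not shown in the diff.
    system_message = {
        "role": "system",
        "content": (
            "You are a classifier. Answer 'Yes' if the user's request is about "
            "software engineering (code, bugs, builds, reviews, tooling, etc.), "
            "otherwise answer 'No'. Reply with a single word."
        ),
    }
    user_message = {"role": "user", "content": user_input}

    try:
        # Make the chat completion call
        response = openai_client.chat.completions.create(
            model="gpt-oss-20b", messages=[system_message, user_message]
        )
        classification = response.choices[0].message.content.strip().lower()
        # Check if the LLM responded with 'Yes'
        return classification.startswith("yes")
    except Exception:
        # Failing open on errors is an assumed design choice, not shown in the diff.
        return True
```

Under this sketch, any reply that does not start with "yes" is treated as out of scope, matching the "Check if the LLM responded with 'Yes'" comment in the hunk above.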